ABSTRACT

The  progress  reports  of  NASA-sponsored  studies  in  the 
areas  of  space  flight  and  guidance  theory  are  presented. 
PROGRESS REPORT NO. 7
on  Studies  in  the  Fields  of 
SPACE  FLIGHT  AND  GUIDANCE  THEORY 

SUMMARY 

The  progress  reports  of  NASA-sponsored  studies  in  the 
areas  of  space  flight  and  guidance  theory  are  presented. 

The  studies  are  carried  on  by  several  universities  and 
industrial  companies.  This  progress  report  covers  the 
period from July 23, 1964 to April 1, 1965. The contracts
are  technically  supervised  by  personnel  of  the  Astrodynamics 
and  Guidance  Theory  Division,  Aero-Astrodynamics  Laboratory, 
Marshall  Space  Flight  Center. 


INTRODUCTION 

This  report  contains  fourteen  papers,  the  subject 
matter  of  which  lies  in  the  areas  of  space  flight  and  guidance 
theory.  These  papers  were  written  by  investigators  employed 
at  agencies  under  contract  to  MSFC. 

This  report  is  the  seventh  of  the  "Progress  Reports" 
and  covers  the  period  from  July  23,  1964  to  April  1,  1965. 
Information  given  in  the  earlier  progress  reports  will  not 
be  repeated  here. 

The  agencies  contributing  and  their  fields  of  major 
interest  are: 


Field of Interest               Agency

Optimization Theory             Vanderbilt University
(Calculus of Variations)        Auburn University
                                Analytical Mechanics Associates

Orbital Transfer                North American Aviation, Inc.
                                United Aircraft Corporation

Control Theory                  Honeywell, Inc.

Celestial Mechanics             University of Wisconsin
                                Hayes International Corporation

Low Thrust Trajectories         Grumman Aircraft Engineering Corp.

Large Computer Exploitation     Georgia Institute of Technology
                                Southern Illinois University

The  objective  of  this  introduction  is  to  briefly  review 
the  contributions  of  each  agency. 

The first paper, by Dr. M. Boyce and Mr. J. Linnstaedter
of Vanderbilt University, develops a multiplier rule and
analogues of the Weierstrass and Clebsch conditions for a
multistage Bolza-Mayer calculus of variations problem. The number
of stages is fixed, but partition points defining stage
boundaries are variable. Discontinuities are allowed in variables
and constraint functions at partition points. The constraints
include finite equations and inequalities, as well as
differential equations, all of which involve control variables. An
appendix to the report summarizes some of the results obtained
by C. H. Denbow, as modified by R. W. Hunt, for a generalized
Bolza problem.

The  second  paper  by  Joe  W.  Reece  and  Grady  R.  Harmon  is 
an  application  of  the  necessary  conditions  resulting  from 
the  Pontryagin  Maximum  Principle  to  a particular  model  for 
the  simulation  of  reentry  trajectories.  The  paper  is  a good 
example  of  the  detailed  analysis  needed  to  achieve  a workable 
computational  procedure,  but  the  method  used  to  solve  for  the 
boundary  conditions  is  yet  to  be  incorporated  into  the 
procedure. 

The third paper, by Henry J. Kelley and Walter F. Denham,
derives the necessary conditions for optimal guidance
polynomial approximations by an ensemble averaging approach.
The merits of this approach can better be evaluated when a
computational scheme utilizing the derived necessary
conditions is outlined and applied to a trajectory analysis
problem.

The fourth paper, by Gary A. McCue and David F. Bender
of North American Aviation, presents a method for the
numerical determination of optimum two-impulse orbital transfers
between inclined elliptical orbits. A numerical optimization
technique termed "adaptive steepest descent" is shown to
overcome convergence difficulties. Results are obtained for
"almost target" coplanar elliptical orbits. Extensions are
then developed for strongly inclined orbits.

The fifth paper, by David F. Bender and Gary A. McCue
of North American Aviation, presents numerical and analytical
results concerning optimum one-impulse orbital transfer
maneuvers. Approximate expressions for the minima of the
one-impulse maneuvers are derived. Numerical comparisons
of one-impulse transfers and corresponding optimum two-impulse
transfers are made. These comparisons show that for a small
range of shapes, one-impulse transfers are optimal.

The sixth paper, by Frank Gobetz of United Aircraft
Corporation, presents a study of the minimum fuel transfer
and rendezvous between neighboring low-eccentricity orbits
by power-limited rockets. The equations of motion are
linearized in three separate coordinate systems, given a variational
treatment, and solved in closed form. Both performance type
and guidance type solutions are presented in each of the
three systems. By choosing an intermediate orbit for the
reference orbit in an application of the linear theory to
interplanetary transfer, results for Earth-Venus and
Earth-Mars transfers are found to agree well with exact results.

The seventh paper, submitted by E. B. Lee of Honeywell,
is entitled "An Approximation to Linear Bounded Phase
Coordinate Control Problems." The technique employs a non-negative
"penalty function" which is small for state variables
satisfying the given constraints, and large outside of this
constraint set. An optimal control problem is solved where, as
a terminal condition, the integral of the penalty function
is bounded by a small constant, thereby limiting the
excursions of the state variables outside of the constraint
set.

The eighth paper, by C. C. Conley of the University of
Wisconsin, studies the solutions of the restricted three-body
problem near those equilibrium points which are collinear
with the two positive masses. This is done to gain insight
toward the development of an analytic proof and classification
of the periodic orbits that pass near these equilibrium points,
which have been discovered numerically by M. Davidson, and also,
it is hoped, to gain insight into the nature of solutions of the
restricted three-body problem in general. The qualitative
observations that are made are all deduced from the linearized
equations.

The  ninth  paper,  by  A.  A.  Nafoosi  and  H.  Passmore  of 
Hayes  International  Corporation,  considers  an  approach  to 
the  analytical  solution  of  the  minimum  fuel  trajectory 
integration  problem  through  the  Hamilton-Jacobi  theory  of 
canonical  transformations.  This  method  replaces  the  ordinary 
differential  equations  of  motion  with  the  Hamilton-Jacobi 
partial  differential  equations.  The  method  of  separation  of 
variables  and  Jacobi's  method  for  solving  partial  differential 
equations  are  discussed  and  applied  to  progressively  more 
realistic  approximations  to  the  minimum  fuel  trajectory  problem. 
This  approach  is  found  to  be  of  limited  usefulness  unless  a 
more  appropriate  transformation  of  the  coordinates  can  be  found 
that  would  produce  a more  easily  solvable  Hamilton-Jacobi 
equation. 

The tenth paper, by Harry Passmore, also applies methods
of celestial mechanics to the problem of deriving an analytical
solution to the minimum fuel trajectory problem. By
considering the λ variables as coordinates of a fictitious body
relative to the vehicle, and transforming the λ equations to
equations relative to the same center of attraction as the
vehicle, equations analogous to the three-body equations are
obtained. These equations are transformed to canonical
equations and solved by Delaunay procedures. The solution
obtained is a first order approximation expected to be most
applicable to the many-orbit low-thrust problem rather than
interplanetary transfer or high-thrust trajectory integration.

The eleventh paper, by Hans K. Hinz, Robert McGill, and
Gerald Taylor, and the twelfth paper, by Paul Kenneth and
Gerald E. Taylor, relate to their numerical experience with
the generalized Newton-Raphson method reported on in Progress
Report No. 5, as applied to the low thrust two-point boundary
value problem. Equations of motion in both applications are
formulated in two-dimensional polar coordinates. One
application concerns geocentric circular orbital transfer. Simple
equations are given for first values used to begin the
iteration. Successful results have been achieved for
trajectories of up to twenty-one revolutions, correct to
four significant figures, with convergence deteriorating past
this point. Greater accuracy may be expected by using
multiple precision arithmetic and better numerical integration
methods.

The  second  application  of  the  numerical  method  concerns 
the  interplanetary  trajectory  with  bounded  thrust  magnitude 
and  thrust  angle  used  as  control  variables.  Transit  time  is 
specified  and  mass  is  maximized.  Again,  convergence  has  been 
obtained  to  an  accuracy  of  four  significant  figures  with 
further  possibilities  for  improvement  by  using  better 
numerical methods.

It  seems  the  Generalized  Newton-Raphson  Method  shows 
value  for  meeting  specified  end-conditions  for  the  sensitive 
low  thrust  trajectory  optimization  problem,  although  the 
geocentric  spiral  trajectory  sensitivity  may  still  offer 
difficulty. 

The  thirteenth  paper  by  I.  E.  Perlin,  J.  H.  Mackay,  et  al., 
contains  a very  thorough  examination  of  the  many  different 
aspects  of  multivariable  function  approximation  by  least  squares 
techniques.  It  also  contains  some  illuminating  examples  of  the 
mathematical  techniques  which  are  used  for  the  selection  of  a 
few  efficient  estimation  variables  from  a larger  set. 

The fourteenth paper, by Robert Silber, describes a
procedure for numerically computing the coefficients for the
Taylor's series expansion of the general solution of a normal
system of first order, ordinary differential equations in terms
of the time on any solution and the initial values of the
variables and time for that solution. The method has appeared
previously in NASA TM X-53059 as part of a more involved method
to compute a guidance type of solution for a system of
differential equations. The present paper singles out the first
mentioned solution as possibly deserving explicit mention, and
brings out the mathematical considerations that justify this
procedure and that are necessary for one to make reasonable
applications of it.

NECESSARY CONDITIONS FOR A MULTISTAGE
BOLZA-MAYER  PROBLEM  INVOLVING  CONTROL  VARIABLES  AND 
HAVING  INEQUALITY  AND  FINITE  EQUATION  CONSTRAINTS 


By  M.  G.  Boyce  and  J.  L.  Linnstaedter 


SUMMARY 


A multiplier rule and analogues of the Weierstrass and Clebsch
conditions are developed for a multistage Bolza-Mayer calculus of variations
problem. The number of stages is fixed, but partition points defining
stage boundaries are variable. Discontinuities are allowed in variables
and constraint functions at partition points. The constraints include
finite equations and inequalities, as well as differential equations, all
of which involve control variables.

An Appendix summarizes some of the results obtained by C. H. Denbow,
as modified by R. W. Hunt, for a generalized Bolza problem. The
Appendix is independent of the rest of the paper.

NOTATION

Ranges of Subscripts and Superscripts

$a = 1, \ldots, p$        $e, j = 1, \ldots, m$    $\alpha, \eta = 1, \ldots, N = n+m+r$
$b = 1, \ldots, p-1$      $g = 1, \ldots, q$       $\beta, \delta = 1, \ldots, M = n+q+r$
$c = 0, \ldots, s$        $i = 1, \ldots, n$       $\gamma = 1, \ldots, K = s+m+r$
$d, h = 1, \ldots, r$     $k = 1, \ldots, s$       $\rho = 0, \ldots, K = s+m+r$

Intervals, Regions and Arcs

$I$       interval $t_0 \le t \le t_p$.
$I_a$     subinterval between partition points $t_{a-1}$ and $t_a$.
$R_a$     open connected set in $(t,x,y)$ space.
$S$       open connected set in $(t_0,\ldots,t_p,x(t_0),x(t_1^-),x(t_1^+),\ldots,x(t_p))$ space.
$R'_a$    open connected set in $(t,z,\dot z)$ space.
$S'$      open connected set in $(t_0,\ldots,t_p,z(t_0),z(t_1^-),z(t_1^+),\ldots,z(t_p))$ space.
$C_a$     admissible sub-arc.
$E$       admissible arc.
$C'_a$    admissible sub-arc for transformed and Appendix problems.
$E'$      admissible arc for transformed and Appendix problems.

Functions and Variables

$t$                         independent variable.
$t_0, \ldots, t_p$          partition set with $t_0 < t_1 < \ldots < t_p$.
$x$                         state variable vector $(x_1, \ldots, x_n)$.
$y$                         control variable vector $(y_1, \ldots, y_m)$.
$L_i^a$                     differential equation functions, $t$ in $I_a$.
$M_g^a$                     finite equation functions, $t$ in $I_a$.
$N_h^a$                     inequality constraint functions, $t$ in $I_a$.
$L_i$                       differential equation functions, $t$ in $I$; $L_i = L_i^a$, $t$ in $I_a$.
$M_g$                       finite equation functions, $t$ in $I$; $M_g = M_g^a$, $t$ in $I_a$.
$N_h$                       inequality constraint functions, $t$ in $I$; $N_h = N_h^a$, $t$ in $I_a$.
$J_k$                       end and intermediate point constraint functions.
$J_0$                       function to be minimized.
$\lambda$                   multiplier vector $(\lambda_1, \ldots, \lambda_n)$ for differential equations.
$\mu$                       multiplier vector $(\mu_1, \ldots, \mu_q)$ for finite equations.
$\nu$                       multiplier vector $(\nu_1, \ldots, \nu_r)$ for inequality constraints.
$D^a, D'^a, A, B$           diagonal matrices.
$x(t_b^-), z(t_b^-)$        left hand limit at $t_b$.
$x(t_b^+), z(t_b^+)$        right hand limit at $t_b$.
$d_{ib}, d_{\alpha b}$      amount of discontinuity at $t_b$.
$c_i^a, c_\alpha^a$         integration constants in multiplier rules.
$z$                         vector $(z_1, \ldots, z_N)$ for transformed and Appendix problems.
$\varphi_\beta^a$           differential equation functions for z-system problems.
$f_\gamma$                  end and intermediate conditions for z-system.
$f_0$                       function to be minimized for z-system.
$\lambda_\beta$             multipliers for z-system.
$F$                         Lagrangian function.
$H$                         generalized Hamiltonian function.
$\mathcal{E}$               Weierstrass E-function.
$e_\rho$                    constant multipliers in transversality equations.
$\pi, \theta$               Clebsch condition variables.
$Z, Y$                      Weierstrass condition variables.


INTRODUCTION 


In 1937 C. H. Denbow (reference 1) formulated a multistage
generalization of the Bolza problem and established necessary and sufficient
conditions for it. His method involved transforming the multistage
problem into a standard problem of Bolza by a transformation due to
W. T. Reid and L. M. Graves. The transformation requires that the
number of stages be fixed and the staging points be distinct. By stages
we refer to the subintervals into which the range of the independent
variable is partitioned by intermediate points involved in the constraints.

R. W. Hunt (2) has applied Denbow's methods to a Mayer form of the
multistage problem in which discontinuities are permitted in the
variables and constraints at staging points. He obtained the first three
necessary conditions. We have summarized his results in the Appendix,
with some minor modifications.

In this paper we further extend the work of Denbow and Hunt to
include control variables, finite equation conditions, and inequality
constraints. Following Hunt, we use the Mayer formulation, which Bliss (3,
p. 190) has shown equivalent to the Bolza form for one stage problems.
Also we have used differential constraints in normal form, a form
directly applicable to trajectory optimization. Hestenes (4, pp. 4-6) has
shown that the one stage problem in this form with control variables is
reducible to the usual form of the Bolza problem and vice-versa. The
method of Valentine (5) is used to transform the inequality constraints
into differential equations.

FORMULATION OF THE PROBLEM


Let $t$ be the independent variable. Define a set of variables
$(t_0, \ldots, t_p)$ contained in the range of $t$ to be a partition set if
and only if $t_0 < t_1 < \ldots < t_p$. Call the elements of the partition
set partition points. Let $I$ denote the interval $t_0 \le t \le t_p$, and
let $I_a$ denote the sub-interval $t_{a-1} \le t < t_a$ for $a = 1, \ldots, p-1$
and $t_{a-1} \le t \le t_a$ for $a = p$.

Let $x(t)$ denote the set of functions $(x_1(t), \ldots, x_n(t))$. For
each $i$, $i = 1, \ldots, n$, assume $x_i(t)$ to be continuous on $I$ except
possibly at partition points $t_b$, $b = 1, \ldots, p-1$, where finite left
and right limits exist; denote these limits by $x_i(t_b^-)$ and $x_i(t_b^+)$,
respectively. The amount of discontinuity of each member of $x(t)$ at
each partition point will be assumed known, and we write

$$x_i(t_b^+) - x_i(t_b^-) - d_{ib} = 0,$$

with each $d_{ib}$ a known constant. Also we let $x_i(t_b) = x_i(t_b^+)$. Thus
$x_i(t)$ is continuous at $t_b$ if and only if $d_{ib} = 0$.

Let $y(t)$ denote the set $(y_1(t), \ldots, y_m(t))$, where $y_j(t)$ is
piecewise continuous on $I$, $j = 1, \ldots, m$, finite discontinuities being
allowed between, as well as at, partition points. In the formulation of
the problem the $y_j(t)$ will occur only as undifferentiated variables
and will not occur in the function to be minimized nor in the end and
intermediate point constraints. Such variables are called control
variables, while the $x_i(t)$ are called state variables.


Differentiation with respect to $t$ will be indicated by a
superposed dot and partial derivatives by subscript variables. Each subscript
or superscript index will always have the range specified when first
used (and given in the table of notations), and repeated indices in a
product will indicate summation unless the contrary is stated.

The problem will be to find in a class of admissible arcs

$$x(t),\ y(t),\ (t_0, \ldots, t_p),\quad t_0 \le t \le t_p,$$

which satisfy differential equations

$$\dot x_i = L_i^a(t,x,y),\quad t \text{ in } I_a,\ a = 1, \ldots, p,\ i = 1, \ldots, n,$$

finite equations

$$M_g^a(t,x,y) = 0,\quad g = 1, \ldots, q,$$

inequalities

$$N_h^a(t,x,y) \ge 0,\quad h = 1, \ldots, r,\quad q + r \le m;$$

and end and intermediate point conditions

$$J_k(t_0, \ldots, t_p, x(t_0), x(t_1^-), x(t_1^+), \ldots, x(t_p)) = 0,\quad k = 1, \ldots, s < (n+1)(p+1),$$

$$x_i(t_b^+) - x_i(t_b^-) - d_{ib} = 0,\quad b = 1, \ldots, p-1,$$

one that will minimize

$$J_0(t_0, \ldots, t_p, x(t_0), x(t_1^-), x(t_1^+), \ldots, x(t_p)).$$
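
The bookkeeping in this formulation (partition sets, the half-open stage intervals, and the prescribed jump conditions at interior partition points) can be illustrated with a short sketch. The function names below are hypothetical and are not part of the paper; the code only restates the definitions given above.

```python
from typing import List, Sequence


def is_partition_set(ts: Sequence[float]) -> bool:
    """True when t_0 < t_1 < ... < t_p (at least two points)."""
    return len(ts) >= 2 and all(a < b for a, b in zip(ts, ts[1:]))


def stage_index(ts: Sequence[float], t: float) -> int:
    """Return a such that t lies in I_a.

    I_a is [t_{a-1}, t_a) for a = 1, ..., p-1 and the closed
    interval [t_{p-1}, t_p] for a = p, as in the formulation.
    """
    p = len(ts) - 1
    if not (ts[0] <= t <= ts[p]):
        raise ValueError("t lies outside [t_0, t_p]")
    for a in range(1, p):
        if ts[a - 1] <= t < ts[a]:
            return a
    return p


def jump_residuals(x_left: Sequence[float],
                   x_right: Sequence[float],
                   d: Sequence[float]) -> List[float]:
    """Residuals x_i(t_b^+) - x_i(t_b^-) - d_ib at one partition point t_b."""
    return [xr - xl - di for xl, xr, di in zip(x_left, x_right, d)]
```

Because a partition point $t_b$ belongs to the stage that begins there, `stage_index` returns $b+1$ at $t = t_b$, which agrees with the convention $x_i(t_b) = x_i(t_b^+)$.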

In order to state precisely the properties of the functions
involved in the problem, let $R_a$ be an open connected set in the $m+n+1$
dimensional $(t,x,y)$ space whose projection on the $t$-axis contains the
interval $I_a$, and let $S$ be an open connected set in the $2np + p + 1$
dimensional space of points

$$(t_0, \ldots, t_p, x(t_0), x(t_1^-), x(t_1^+), \ldots, x(t_p)).$$

The functions $L_i^a$, $M_g^a$, $N_h^a$ are assumed continuous with continuous
partial derivatives through those of third order in $R_a$, and $J_0$, $J_k$ are
to have such continuity properties in $S$. For each $a$, the matrix

$$\begin{Vmatrix} M^a_{g y_j} & 0 \\ N^a_{h y_j} & D^a \end{Vmatrix}$$

is assumed of rank $q + r$ in $R_a$, where $D^a$ is an $r$ by $r$ diagonal
matrix with $N_1^a, \ldots, N_r^a$ as diagonal elements. The matrix

$$\begin{Vmatrix} J_{c t_0} & J_{c t_b} & J_{c t_p} & J_{c x_i(t_0)} & J_{c x_i(t_b^-)} & J_{c x_i(t_b^+)} & J_{c x_i(t_p)} \end{Vmatrix},\quad c = 0, \ldots, s,$$

is assumed of rank $s + 1$ in $S$.

An admissible set is a set $(t,x,y)$ in $R_a$ for some
$a = 1, \ldots, p$. An admissible sub-arc $C_a$ is a set of functions
$x(t)$, $y(t)$, $t$ on $I_a$, with each $(t,x,y)$ admissible, and such that
$x(t)$ is continuous and $\dot x(t)$, $y(t)$ are piecewise continuous on $I_a$.
An admissible arc is a partition set $(t_0, \ldots, t_p)$ together with a set
of admissible sub-arcs $C_a$, $a = 1, \ldots, p$, such that the set
$(t_0, \ldots, t_p, x(t_0), x(t_1^-), x(t_1^+), \ldots, x(t_p))$ is in $S$.

THE MULTIPLIER RULE

An admissible arc $E$ for which

$$J_k(t_0, \ldots, t_p, x(t_0), x(t_1^-), x(t_1^+), \ldots, x(t_p)) = 0,$$

$$x_i(t_b^+) - x_i(t_b^-) - d_{ib} = 0,$$

is said to satisfy the multiplier rule if there exists a function

$$H(t,x,y,\lambda,\mu,\nu) = \lambda_i L_i^a - \mu_g M_g^a + \nu_h N_h^a$$

with multipliers $\lambda_i(t)$, $\mu_g(t)$, $\nu_h(t)$ continuous except possibly at
partition points or corners of $E$, where finite left and right limits
exist, such that for each $t$ in $I_a$, $a = 1, \ldots, p$,

$$\lambda_i = -\int_{t_{a-1}}^{t} H_{x_i}\,dt + c_i^a,\quad H_{y_j} = 0,\quad \dot x_i = L_i^a,\quad M_g^a = 0,\quad N_h^a \ge 0, \tag{1}$$

and such that the transversality matrix

$$\begin{Vmatrix}
H(t_0) & H(t_b^+) - H(t_b^-) & -H(t_p) & -\lambda_i(t_0) & -\lambda_i(t_b^+) + \lambda_i(t_b^-) & \lambda_i(t_p) \\
J_{c t_0} & J_{c t_b} & J_{c t_p} & J_{c x_i(t_0)} & J_{c x_i(t_b^+)} + J_{c x_i(t_b^-)} & J_{c x_i(t_p)}
\end{Vmatrix} \tag{2}$$

is of rank $s + 1$. The multipliers $\nu_h$ are zero when $N_h^a > 0$. Every
minimizing arc $E$ must satisfy the multiplier rule.

It may be noted that the constants $c_i^a$ are the initial values
$\lambda_i(t_{a-1}^+)$, respectively, of the multipliers $\lambda_i$
for the several stages.

Corollary. Between corners of a minimizing arc $E$ the equations

$$\dot x_i = H_{\lambda_i},\quad \dot\lambda_i = -H_{x_i},\quad H_{y_j} = 0,\quad H_{\mu_g} = 0,\quad \nu_h H_{\nu_h} = 0 \text{ (not summed)}$$

hold and hence also

$$\frac{dH}{dt} = H_t.$$

To prove the multiplier rule we transform the problem into a
Denbow problem of the type treated by Hunt and summarized in the Appendix.
Let $z(t)$ denote the set $(z_1(t), \ldots, z_N(t))$, $N = n + m + r$, where

$$z_i(t) = x_i(t),\quad z_{n+j}(t) = \int_{t_0}^{t} y_j(t)\,dt,\quad z_{n+m+h}(t) = \int_{t_0}^{t} \sqrt{N_h(t,x(t),y(t))}\,dt;$$

or, equivalently,

$$z_i(t) = x_i(t),\quad \dot z_{n+j}(t) = y_j(t),\quad \dot z_{n+m+h}(t) = \sqrt{N_h(t,x(t),y(t))}, \tag{3}$$

with initial conditions $z_{n+j}(t_0) = z_{n+m+h}(t_0) = 0$. Note that for
admissible arcs $z_{n+j}$ and $z_{n+m+h}$ are continuous on $I$. In the
definition of $z_{n+m+h}$ the superscript $a$ has been omitted from $N_h$. Where
this is done, it is to be understood that $N_h(t,x,y) = N_h^a(t,x,y)$ when
$t$ is in $I_a$. Similar usage applies to $L_i$ and $M_g$.
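
As a numerical illustration of transformation (3), the sketch below builds sampled versions of $z_{n+j}$ and $z_{n+m+h}$ from sampled controls $y_j(t)$ and sampled constraint values $N_h(t)$. The trapezoidal rule and the function names are assumptions made for the illustration only; the paper works with the exact integrals.

```python
import math
from typing import List, Sequence, Tuple


def cumulative_trapezoid(ts: Sequence[float],
                         values: Sequence[float]) -> List[float]:
    """Cumulative trapezoidal integral of sampled values, equal to 0 at ts[0]."""
    out = [0.0]
    for k in range(1, len(ts)):
        step = 0.5 * (values[k] + values[k - 1]) * (ts[k] - ts[k - 1])
        out.append(out[-1] + step)
    return out


def transform_samples(ts: Sequence[float],
                      y_samples: Sequence[Sequence[float]],
                      n_samples: Sequence[Sequence[float]]
                      ) -> Tuple[List[List[float]], List[List[float]]]:
    """Map y_j(t) and N_h(t) >= 0 to samples of z_{n+j}(t) and z_{n+m+h}(t).

    z_{n+j} integrates y_j and z_{n+m+h} integrates sqrt(N_h), both
    starting from zero at t_0, so that zdot_{n+m+h}**2 = N_h.
    """
    if any(v < 0 for row in n_samples for v in row):
        raise ValueError("inequality N_h >= 0 is violated")
    z_controls = [cumulative_trapezoid(ts, row) for row in y_samples]
    z_slacks = [cumulative_trapezoid(ts, [math.sqrt(v) for v in row])
                for row in n_samples]
    return z_controls, z_slacks
```

The square root is what turns the inequality $N_h \ge 0$ into the differential equation $\dot z_{n+m+h}^2 - N_h = 0$: a real $\dot z_{n+m+h}$ exists exactly when the inequality holds.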

Denote the differential equations for the transformed problem by

$$\varphi_\beta^a(t,z,\dot z) = 0,\quad \beta = 1, \ldots, n + q + r = M,\quad t \text{ in } I_a, \tag{4}$$

where

$$\begin{aligned}
\varphi_i^a &= \dot z_i - L_i^a(t, z_1, \ldots, z_n, \dot z_{n+1}, \ldots, \dot z_{n+m}),\\
\varphi_{n+g}^a &= M_g^a(t, z_1, \ldots, z_n, \dot z_{n+1}, \ldots, \dot z_{n+m}),\\
\varphi_{n+q+h}^a &= \dot z_{n+m+h}^2 - N_h^a(t, z_1, \ldots, z_n, \dot z_{n+1}, \ldots, \dot z_{n+m}).
\end{aligned} \tag{5}$$

Let the conditions on end and intermediate points be denoted by

$$f_\gamma(t_0, \ldots, t_p, z(t_0), z(t_1^-), z(t_1^+), \ldots, z(t_p)) = 0,\quad \gamma = 1, \ldots, s+m+r = K, \tag{6}$$

where

$$\begin{aligned}
f_k &= J_k(t_0, \ldots, t_p, z_1(t_0), \ldots, z_n(t_0), z_1(t_1^-), \ldots, z_n(t_1^-), z_1(t_1^+), \ldots, z_n(t_1^+), \ldots, z_1(t_p), \ldots, z_n(t_p)),\\
f_{s+j} &= z_{n+j}(t_0),\\
f_{s+m+h} &= z_{n+m+h}(t_0),
\end{aligned}$$

plus the following difference relations at intermediate points,

$$z_\alpha(t_b^+) - z_\alpha(t_b^-) - d_{\alpha b} = 0,\quad \alpha = 1, \ldots, N. \tag{7}$$

Note that $d_{\alpha b} = 0$ for $\alpha = n + 1, \ldots, N$, since $z_{n+j}$ and $z_{n+m+h}$
are continuous. Let the transform of the function $J_0$ be denoted by
$f_0$.

Each point $(t, x_1, \ldots, x_n, y_1, \ldots, y_m)$ of $R_a$ transforms into a
point $(t, z_1, \ldots, z_n, \dot z_{n+1}, \ldots, \dot z_{n+m})$. Let $R'_a$ denote the open set in
$(2N+1)$-dimensional $(t,z,\dot z)$ space whose restriction to the coordinates
$(t, z_1, \ldots, z_n, \dot z_{n+1}, \ldots, \dot z_{n+m})$ is the transform of $R_a$, the other
coordinates of $R'_a$ having unlimited range, $-\infty$ to $+\infty$.

Let $S'$ denote an open set in the $2Np + p + 1$ dimensional space
of points $(t_0, \ldots, t_p, z(t_0), z(t_1^-), z(t_1^+), \ldots, z(t_p))$ whose restriction to
$(t_0, \ldots, t_p, z_1(t_0), \ldots, z_n(t_0), z_1(t_1^-), \ldots, z_n(t_1^-), z_1(t_1^+), \ldots, z_n(t_1^+), \ldots, z_1(t_p), \ldots, z_n(t_p))$
is the transform of $S$ and which includes zero
values for $z_{n+1}(t_0), \ldots, z_N(t_0)$.

An admissible set for the transformed problem will be a set
$(t,z,\dot z)$ in $R'_a$ for some $a$. An admissible sub-arc $C'_a$ will be a set
of functions $z(t)$ on $I_a$ having each $(t,z,\dot z)$ admissible, with
$z(t)$ continuous and $\dot z(t)$ piecewise continuous on $I_a$. An admissible
arc will be a partition set $(t_0, \ldots, t_p)$ together with a set of
admissible sub-arcs $C'_a$ whose end and intermediate points lie in $S'$.
a 

It follows from the assumptions about the functions $L_i^a$, $M_g^a$, $N_h^a$
in $R_a$ and $J_0$, $J_k$ in $S$ that the functions $\varphi_\beta^a$ and $f_0$, $f_\gamma$ will
have continuous partial derivatives to the third order in regions $R'_a$
and $S'$, respectively. The matrix $\|\varphi^a_{\beta \dot z_\alpha}\|$ can be readily verified to be
of rank $M$ for $t$ in $I_a$ since it can be written

$$\begin{Vmatrix}
I & -L^a_{i y_j} & 0 \\
0 & M^a_{g y_j} & 0 \\
0 & -N^a_{h y_j} & D'
\end{Vmatrix}$$

where $I$ is an $n$ by $n$ identity matrix and $D'$ is an $r$ by $r$ diagonal
matrix with diagonal elements $2\dot z_{n+m+1}, \ldots, 2\dot z_N$. Also the matrix

$$\begin{Vmatrix} f_{\rho t_0} & f_{\rho t_b} & f_{\rho t_p} & f_{\rho z(t_0)} & f_{\rho z(t_b^-)} & f_{\rho z(t_b^+)} & f_{\rho z(t_p)} \end{Vmatrix},\quad \rho = 0, 1, \ldots, K,$$

is found to be of rank $K + 1$.

The assumptions made in the formulation of the problem in the
Appendix are thus established, and hence the theorems of the Appendix can
be applied. From equations (5) the required function $F$ becomes

$$F(t,z,\dot z,\lambda,\mu,\nu) = \lambda_i(\dot z_i - L_i^a) + \mu_g M_g^a + \nu_h(\dot z_{n+m+h}^2 - N_h^a), \tag{8}$$

where the arguments of $L_i^a$, $M_g^a$, $N_h^a$ are

$$(t, z_1, \ldots, z_n, \dot z_{n+1}, \ldots, \dot z_{n+m}),\quad t \text{ in } I_a,\ a = 1, \ldots, p,$$

and the multipliers $\lambda_i(t)$, $\mu_g(t)$, $\nu_h(t)$ are continuous except possibly
at corners or partition points, at which right and left limits exist.

Now define a function $H$ whose arguments are
$(t, z_1, \ldots, z_n, \dot z_{n+1}, \ldots, \dot z_{n+m}, \lambda, \mu, \nu)$ as follows:

$$H = \lambda_i L_i^a - \mu_g M_g^a + \nu_h N_h^a.$$

The relationship between $F$ and $H$ is given by the equation

$$F = \lambda_i \dot z_i + \nu_h \dot z_{n+m+h}^2 - H,$$

and from equations (A-5) of the Appendix

$$\lambda_i = -\int_{t_{a-1}}^{t} H_{z_i}\,dt + c_i^a,$$

$$-H_{\dot z_{n+j}} = c_{n+j}^a,$$

$$2\nu_h \dot z_{n+m+h} = c_{n+m+h}^a \text{ (not summed)}.$$
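
The relation $F = \lambda_i \dot z_i + \nu_h \dot z_{n+m+h}^2 - H$ is a purely algebraic consequence of the definitions of $F$ and $H$, and it can be spot-checked numerically. The sketch below uses hypothetical helper names and arbitrary sample values; the constraint functions enter only through their values at one point.

```python
from typing import Sequence


def F_value(lam: Sequence[float], mu: Sequence[float], nu: Sequence[float],
            zdot_x: Sequence[float], zdot_s: Sequence[float],
            L: Sequence[float], M: Sequence[float], N: Sequence[float]) -> float:
    """F = lam_i (zdot_i - L_i) + mu_g M_g + nu_h (zdot_{n+m+h}**2 - N_h)."""
    return (sum(l * (zd - Li) for l, zd, Li in zip(lam, zdot_x, L))
            + sum(m * Mg for m, Mg in zip(mu, M))
            + sum(v * (zs * zs - Nh) for v, zs, Nh in zip(nu, zdot_s, N)))


def H_value(lam: Sequence[float], mu: Sequence[float], nu: Sequence[float],
            L: Sequence[float], M: Sequence[float], N: Sequence[float]) -> float:
    """H = lam_i L_i - mu_g M_g + nu_h N_h."""
    return (sum(l * Li for l, Li in zip(lam, L))
            - sum(m * Mg for m, Mg in zip(mu, M))
            + sum(v * Nh for v, Nh in zip(nu, N)))


def identity_gap(lam, mu, nu, zdot_x, zdot_s, L, M, N) -> float:
    """Return F - (lam_i zdot_i + nu_h zdot_{n+m+h}**2 - H); zero up to rounding."""
    rhs = (sum(l * zd for l, zd in zip(lam, zdot_x))
           + sum(v * zs * zs for v, zs in zip(nu, zdot_s))
           - H_value(lam, mu, nu, L, M, N))
    return F_value(lam, mu, nu, zdot_x, zdot_s, L, M, N) - rhs
```

The identity holds for any values of the arguments, not only along admissible arcs, which is why it can be used freely when the Appendix conditions on $F$ are rewritten in terms of $H$.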

h n+m+h  n+m+h 


Furthermore, the multiplier rule given in the Appendix establishes the
existence of constants $e_\rho$, not all zero, satisfying the transversality
conditions:

$$e_c J_{c t_0} + \left[\lambda_i \dot z_i + 2\nu_h \dot z_{n+m+h}^2 - \dot z_{n+j} H_{\dot z_{n+j}}\right]^{t_0} = 0,$$

$$e_c J_{c t_b} + \left[\lambda_i \dot z_i + 2\nu_h \dot z_{n+m+h}^2 - \dot z_{n+j} H_{\dot z_{n+j}}\right]_{t_b^-}^{t_b^+} = 0,$$

$$e_c J_{c t_p} - \left[\lambda_i \dot z_i + 2\nu_h \dot z_{n+m+h}^2 - \dot z_{n+j} H_{\dot z_{n+j}}\right]^{t_p} = 0,$$

$$e_c J_{c z_i(t_0)} - \left[\lambda_i\right]^{t_0} = 0,$$

$$e_{s+j} - \left[-H_{\dot z_{n+j}}\right]^{t_0} = 0,$$

$$e_{s+m+h} - \left[2\nu_h \dot z_{n+m+h}\right]^{t_0} = 0 \text{ (not summed)},$$

$$e_c\left(J_{c z_i(t_b^+)} + J_{c z_i(t_b^-)}\right) - \left[\lambda_i\right]_{t_b^-}^{t_b^+} = 0,$$

$$-\left[-H_{\dot z_{n+j}}\right]_{t_b^-}^{t_b^+} = 0,$$

$$-\left[2\nu_h \dot z_{n+m+h}\right]_{t_b^-}^{t_b^+} = 0 \text{ (not summed)},$$

$$e_c J_{c z_i(t_p)} + \left[\lambda_i\right]^{t_p} = 0,$$

$$\left[-H_{\dot z_{n+j}}\right]^{t_p} = 0,$$

$$\left[2\nu_h \dot z_{n+m+h}\right]^{t_p} = 0 \text{ (not summed)}.$$

Recalling the equations $-H_{\dot z_{n+j}} = c_{n+j}^a$ and observing from the
foregoing equations that the $H_{\dot z_{n+j}}$ are continuous at partition points
and zero at $t_p$, we have $c_{n+j}^a = 0$ for each $a$ and $j$. Similarly,
$2\nu_h \dot z_{n+m+h} = c_{n+m+h}^a$ (not summed), and the $2\nu_h \dot z_{n+m+h}$ are continuous
at partition points and zero at $t_p$; hence $c_{n+m+h}^a = 0$ for each $a$
and $h$. Since $\dot z_{n+m+h}^2 = N_h$, this implies that $\nu_h = 0$ when $N_h > 0$.

It now follows that

$$\lambda_i \dot z_i + 2\nu_h \dot z_{n+m+h}^2 - \dot z_{n+j} H_{\dot z_{n+j}} = H,$$

and the first $p + 1$ transversality equations become

$$e_c J_{c t_0} + \left[H\right]^{t_0} = 0,\qquad e_c J_{c t_b} + \left[H\right]_{t_b^-}^{t_b^+} = 0,\qquad e_c J_{c t_p} - \left[H\right]^{t_p} = 0.$$

Thus we have $(p+1)(n+m+r+1)$ transversality equations; but, since
$e_\rho = 0$ for $\rho > s$ (i.e., the last $m+r$ of the $e$'s are zero), these may
be reduced to only $(p+1)(n+1)$ transversality equations. Changing
variables to those of the original problem and writing this reduced set of
transversality equations in equivalent matrix form completes the proof
of the multiplier rule.

An extremal is an admissible arc and set of multipliers satisfying
equations (1) and such that its functions $x(t)$, $y(t)$, $\lambda_i(t)$, $\mu_g(t)$, $\nu_h(t)$
have continuous first derivatives except possibly at partition points,
where finite left and right limits exist. An extremal, or sub-arc of an
extremal, is called non-singular if the determinant


H 


y/e 


M 


Syj 


0 


0 


0 0 


0 


0 A 


0 


0 A B 


is  different  from  zero  along  it,  A and  B being  diagonal  matrices 
with  diagonal  elements  nTn^,  ...,  an(l  •••>  respectively. 

To define normal arcs, let the transversality conditions be used in
equation form involving constant multipliers $e_0, \ldots, e_s$, as in the
proof given for the multiplier rule. An admissible arc with a set of
multipliers $\lambda_i$, $\mu_g$, $\nu_h$, $e_c$ satisfying the multiplier rule is then
called normal if $e_0 = 1$. For this value of $e_0$ the multipliers are
unique. On putting $e_0 = 1$ in the transversality equations the
following equivalent matrix form is obtained.

For a normal minimizing arc the transversality matrix

$$\begin{Vmatrix}
H(t_0) + J_{0 t_0} & H(t_b^+) - H(t_b^-) + J_{0 t_b} & -H(t_p) + J_{0 t_p} & -\lambda_i(t_0) + J_{0 x_i(t_0)} & -\lambda_i(t_b^+) + \lambda_i(t_b^-) + J_{0 x_i(t_b^+)} + J_{0 x_i(t_b^-)} & \lambda_i(t_p) + J_{0 x_i(t_p)} \\
J_{k t_0} & J_{k t_b} & J_{k t_p} & J_{k x_i(t_0)} & J_{k x_i(t_b^+)} + J_{k x_i(t_b^-)} & J_{k x_i(t_p)}
\end{Vmatrix}$$

is of rank $s$.

Since the matrix is of order $s + 1$ by $(n+1)(p+1)$, the
requirement that the rank be $s$ imposes $(n+1)(p+1) - s$ conditions. This is
one more condition than is imposed by the multiplier rule as first
stated, the condition there being sufficient to determine the multipliers
only up to an arbitrary proportionality factor.


WEIERSTRASS  CONDITION 

The $\mathcal{E}$-function of the Appendix becomes, on using F as given by
equation (8),

$$\mathcal{E} = \lambda_i \dot Z_i + \nu_h \dot Z_{n+m+h}^2 - H(\dot Z) - \lambda_i \dot z_i - \nu_h \dot z_{n+m+h}^2 + H(\dot z)$$
$$\qquad - (\dot Z_i - \dot z_i)\lambda_i + (\dot Z_{n+j} - \dot z_{n+j})\,H_{\dot z_{n+j}}(\dot z) - (\dot Z_{n+m+h} - \dot z_{n+m+h})\,2\nu_h \dot z_{n+m+h},$$

where $H(\dot Z)$ has the same arguments as $H(\dot z)$ except that
$\dot Z_{n+1}, \ldots, \dot Z_{n+m}$ replace $\dot z_{n+1}, \ldots, \dot z_{n+m}$.
Since $\dot z_{n+j} = y_j$ and $\dot z_{n+m+h}^2 = N_h$, it follows from the multiplier rule that
along a minimizing arc $H_{\dot z_{n+j}}(\dot z)$, $\nu_h \dot z_{n+m+h}$, and $\nu_h \dot z_{n+m+h}^2$ are all zero.
Hence, after simplification of $\mathcal{E}$, the Weierstrass condition is that for
a normal minimizing arc E the inequality

$$\mathcal{E} = H(\dot z) - H(\dot Z) + \nu_h \dot Z_{n+m+h}^2 \ge 0$$

must hold at each element $(t, z, \dot z, \lambda, \mu, \nu)$ of E for all admissible
sets $(t, z, \dot Z)$ satisfying $M_g(t, z, \dot Z) = 0$ and $\dot Z_{n+m+h}^2 - N_h(t, z, \dot Z) = 0$. Let
$z_i$ be replaced by $x_i$, $\dot z_{n+j}$ by $y_j$, $\dot Z_{n+j}$ by $Y_j$, and $\dot Z_{n+m+h}^2$ by
$N_h(t, x, Y)$. Then, on referring to the definition of H and utilizing
the facts that along a minimizing arc $M_g(t, x, y) = 0$, $\nu_h N_h(t, x, y) = 0$
(not summed) and that $M_g(t, x, Y)$ is required to be zero in the
Weierstrass condition, one can reduce the condition to the following form.

Weierstrass Condition. For a normal minimizing arc E the inequality

$$\lambda_i L_i(t, x, y) \ge \lambda_i L_i(t, x, Y)$$

must hold at each element $(t, x, y, \lambda, \mu, \nu)$ of E for all admissible sets
$(t, x, Y)$ satisfying $M_g(t, x, Y) = 0$ and $N_h(t, x, Y) \ge 0$.


CLEBSCH  CONDITION 


To apply the Clebsch condition of the Appendix to our transformed
problem we need the second partial derivatives of F. From equation (8)
these are found to be

$$F_{\dot z_i \dot z_\alpha} = 0, \qquad F_{\dot z_{n+j} \dot z_{n+e}} = -H_{\dot z_{n+j} \dot z_{n+e}}, \qquad F_{\dot z_{n+j} \dot z_{n+m+h}} = 0,$$

$$F_{\dot z_{n+m+h} \dot z_{n+m+d}} = \begin{cases} 2\nu_h & \text{if } d = h, \\ 0 & \text{if } d \ne h. \end{cases}$$

On dropping the terms in the Clebsch inequality having zero coefficients,
re-numbering the subscripts of the $\pi$'s in the remaining terms and
denoting the last r of them by $\theta_1, \ldots, \theta_r$, we can state that for a
normal minimizing arc the inequality

$$-H_{y_j y_e}\,\pi_j \pi_e + 2\nu_h \theta_h^2 \ge 0$$

must hold at each element of E for all $\pi_1, \ldots, \pi_m, \theta_1, \ldots, \theta_r$ satisfying
the equations $M_{g \dot z_{n+j}}\,\pi_j = 0$, $N_{h \dot z_{n+j}}\,\pi_j - 2\dot z_{n+m+h}\,\theta_h = 0$ (h not
summed).

By the multiplier rule, $\nu_h = 0$ at an element where $N_h > 0$. At
an element where $N_h = 0$, and hence $\dot z_{n+m+h} = 0$, one may choose $\theta_h \ne 0$
but $\pi_1, \ldots, \pi_m$ and the remaining $\theta$'s all zero. The Clebsch condition
would then imply $\nu_h \ge 0$. Thus, for a normal minimizing arc the
multipliers $\nu_h$ are all non-negative.

Since $\nu_h = 0$ when $N_h > 0$, it follows that at elements of a
minimizing arc where $N_h > 0$ the term $2\nu_h \theta_h^2$ in the inequality
would drop out. When $N_h = 0$ the term can also be dropped, for
$\dot z_{n+m+h}$ would then be zero, and the condition

$$N_{h \dot z_{n+j}}\,\pi_j - 2\dot z_{n+m+h}\,\theta_h = 0$$

(h not summed) would be satisfied for any $\theta_h$. In particular the Clebsch
condition would have to be satisfied with $\theta_h = 0$ provided $N_{h \dot z_{n+j}}\,\pi_j = 0$
and $M_{g \dot z_{n+j}}\,\pi_j = 0$. Thus the condition can finally be stated in the
following form.

Clebsch Condition. For a normal minimizing arc E the inequality

$$H_{y_j y_e}(t, x, y, \lambda, \mu, \nu)\,\pi_j \pi_e \le 0$$

must hold at each element $(t, x, y, \lambda, \mu, \nu)$ of E for all sets
$\pi_1, \ldots, \pi_m$ satisfying $M_{g y_j}(t, x, y)\,\pi_j = 0$ and $N_{h y_j}(t, x, y)\,\pi_j = 0$,
where in the last equation h ranges only over the subset of
$1, \ldots, r$ for which $N_h(t, x, y) = 0$.


APPENDIX

This appendix gives the formulation of the Denbow problem as modified
by Hunt together with the multiplier rule and the necessary conditions
analogous to those of Weierstrass and Clebsch. At the expense of
some repetition, we have made this appendix independent of the main part
of the paper.

Let t be the independent variable. For fixed p, define a set of
variables $(t_0, t_1, \ldots, t_p)$ to be a partition set if and only if
$t_0 < t_1 < \ldots < t_p$. Let I denote the interval $t_0 \le t \le t_p$ and $I_a$
the subinterval $t_{a-1} \le t < t_a$ for $a = 1, \ldots, p - 1$ and
$t_{a-1} \le t \le t_a$ for $a = p$. Let $z(t)$ denote the set of functions
$(z_1(t), \ldots, z_N(t))$, where each $z_\alpha(t)$, $\alpha = 1, \ldots, N$, is continuous on
I except possibly at partition points $t_1, \ldots, t_{p-1}$. At these points
right and left limits $z_\alpha(t_1^-), z_\alpha(t_1^+), \ldots, z_\alpha(t_{p-1}^+)$ are assumed to exist
and we let $z_\alpha(t_b) = z_\alpha(t_b^+)$, $b = 1, \ldots, p - 1$.

The problem will be to find in a class of admissible arcs

$$z(t), \quad (t_0, \ldots, t_p), \quad t_0 \le t \le t_p,$$

satisfying differential equations

$$\phi_\beta(t, z, \dot z) = 0, \quad t \text{ in } I_a, \quad \beta = 1, \ldots, M < N, \tag{A-1}$$

and end and intermediate point conditions

$$f_\rho(t_0, \ldots, t_p, z(t_0), z(t_1^-), z(t_1^+), \ldots, z(t_p)) = 0, \quad \rho = 1, \ldots, K < (N+1)(p+1), \tag{A-2}$$

(A-3) [This equation, which relates $z_\alpha(t_b^+)$ and $z_\alpha(t_b^-)$, is only partly legible in the scanned source.]

one that will minimize

$$f_0(t_0, \ldots, t_p, z(t_0), z(t_1^-), z(t_1^+), \ldots, z(t_p)).$$

Let $R_a$ be an open connected set in the 2N+1 dimensional $(t, z, \dot z)$
space whose projection on the t-axis contains $I_a$. The functions $\phi_\beta$
are required to have continuous third partial derivatives in $R_a$ and
each matrix $\| \phi_{\beta \dot z_\alpha} \|$ is assumed of rank M in $R_a$. Let S denote an
open connected set in the 2Np+p+1 dimensional space of points
$(t_0, \ldots, t_p, z(t_0), z(t_1^-), z(t_1^+), \ldots, z(t_p))$ in which the functions
$f_\rho$, $\rho = 0, 1, \ldots, K$, have continuous third partial derivatives and the
matrix

(A-4) [The matrix is illegible in the scanned source.]

is of rank K+1.

An admissible set is a set $(t, z, \dot z)$ in $R_a$ for some $a = 1, \ldots, p$.
An admissible subarc $C_a$ is a set of functions $z(t)$, t on $I_a$, with
each $(t, z, \dot z)$ an admissible set and such that $z(t)$ is continuous and
$\dot z(t)$ is piecewise continuous on $I_a$. An admissible arc E is a
partition set $(t_0, \ldots, t_p)$ together with a set of admissible subarcs $C_a$,
$a = 1, \ldots, p$, such that the set $(t_0, \ldots, t_p, z(t_0), z(t_1^-), z(t_1^+), \ldots, z(t_p))$
is in S.

Multiplier Rule. An admissible arc E' that satisfies equations
(A-1), (A-2), (A-3) is said to satisfy the multiplier rule if there
exist constants $e_\rho$ not all zero and a function

$$F(t, z, \dot z, \lambda) = \lambda_\beta(t)\,\phi_\beta(t, z, \dot z), \quad t \text{ in } I_a,$$

with multipliers $\lambda_\beta(t)$ continuous except possibly at corners or
discontinuities of E' where left and right limits exist, such that the
following equations hold:


(A-5) [The equations of the multiplier rule are garbled in the scanned source and are not reproduced here.]


Every  minimizing  arc  must  satisfy  the  multiplier  rule. 

An extremal is defined to be an admissible arc and set of multipliers

$$z(t), \ \lambda_\beta(t), \quad (t_0, \ldots, t_p), \quad t_0 \le t \le t_p,$$

satisfying equations (A-1) and (A-5) and such that the functions
$\dot z_\alpha(t)$, $\lambda_\beta(t)$ have continuous first derivatives except possibly at
partition points, where finite left and right limits exist. An extremal is
non-singular in case the determinant

$$\begin{vmatrix} F_{\dot z_\alpha \dot z_\eta} & \phi_{\beta \dot z_\alpha} \\ \phi_{\beta \dot z_\eta} & 0 \end{vmatrix}, \qquad \alpha, \eta = 1, \ldots, N; \quad \beta = 1, \ldots, M,$$

is different from zero along it. An admissible arc with a set of
multipliers satisfying the multiplier rule is called normal if $e_0 = 1$. With
this value of $e_0$ the set of multipliers is unique.

Weierstrass Condition. An admissible arc E' with a set of
multipliers $\lambda_\beta(t)$ is said to satisfy the Weierstrass condition if

$$\mathcal{E}(t, z, \dot z, \lambda, \dot Z) = F(t, z, \dot Z, \lambda) - F(t, z, \dot z, \lambda) - (\dot Z_\alpha - \dot z_\alpha)\,F_{\dot z_\alpha}(t, z, \dot z, \lambda) \ge 0$$

holds at every element $(t, z, \dot z, \lambda)$ of E' for all admissible sets
$(t, z, \dot Z)$ satisfying the equations $\phi_\beta = 0$. Every normal minimizing arc
must satisfy the Weierstrass condition.
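
As a quick numerical illustration of the definition above, the following Python sketch evaluates the excess function for a single unconstrained variable. The integrand F = ż² and the helper name `excess` are assumptions chosen for illustration; they are not taken from the paper.

```python
def excess(F, dF, z_dot, Z_dot):
    """Weierstrass excess function E = F(Z') - F(z') - (Z' - z') * F_z'(z')."""
    return F(Z_dot) - F(z_dot) - (Z_dot - z_dot) * dF(z_dot)


# Toy integrand (assumption for illustration): F = z'^2, so F_z' = 2 z'.
F = lambda v: v * v
dF = lambda v: 2.0 * v

# Pairs (z', Z'): slope on the arc and an arbitrary comparison slope.
samples = [(-1.0, 2.0), (0.5, 0.5), (3.0, -4.0)]
values = [excess(F, dF, z, Z) for z, Z in samples]
print(values)
```

For this integrand the excess function reduces to (Ż − ż)², so it is non-negative for every pair of slopes.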

Clebsch Condition. An admissible arc E' with a set of multipliers
$\lambda_\beta(t)$ is said to satisfy the Clebsch condition if

$$F_{\dot z_\alpha \dot z_\eta}(t, z, \dot z, \lambda)\,\pi_\alpha \pi_\eta \ge 0$$

holds at every element $(t, z, \dot z, \lambda)$ of E' for all sets $(\pi_1, \ldots, \pi_N)$
satisfying the equations

$$\phi_{\beta \dot z_\alpha}(t, z, \dot z)\,\pi_\alpha = 0.$$

Every normal minimizing arc must satisfy the Clebsch condition.
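
A condition of this type can be checked numerically at a single element by restricting the quadratic form to the null space of the constraint matrix. The sketch below is one way to do that with NumPy; the function name `clebsch_holds` and the 2-by-2 test matrices are hypothetical and serve only as an illustration.

```python
import numpy as np


def clebsch_holds(Q, G, tol=1e-10):
    """Return True if pi^T Q pi >= 0 for every pi satisfying G @ pi = 0.

    Q plays the role of the matrix of second derivatives of F and
    G the role of the constraint matrix, both evaluated at one element.
    """
    Q = np.asarray(Q, dtype=float)
    G = np.atleast_2d(np.asarray(G, dtype=float))
    # Null-space basis of G: rows of Vt beyond the numerical rank of G.
    _, s, vt = np.linalg.svd(G)
    rank = int(np.sum(s > tol))
    Z = vt[rank:].T
    if Z.shape[1] == 0:
        return True  # only pi = 0 is admissible
    reduced = Z.T @ Q @ Z
    eigenvalues = np.linalg.eigvalsh(0.5 * (reduced + reduced.T))
    return bool(np.all(eigenvalues >= -tol))


Q = np.diag([1.0, -1.0])
print(clebsch_holds(Q, [[0.0, 1.0]]))  # admissible pi = (pi_1, 0)
print(clebsch_holds(Q, [[1.0, 0.0]]))  # admissible pi = (0, pi_2)
```

In the first call only the direction with positive curvature is admissible, so the condition holds; in the second call the admissible direction has negative curvature and the condition fails.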
SUMMARY

The Maximum Principle of Pontryagin is used to find the point-to-point
re-entry trajectory of a space vehicle with an offset center of
gravity which will minimize the accumulated aerodynamic acceleration.
The mathematical model used incorporates the yaw angle and the true
angle of attack as control variables. The set of characteristic
differential equations is written with both algebraic and differential
constraints. A computation procedure is devised so that numerical
solutions can be obtained on a digital computer.
$F_{am}$  Aerodynamic force in the missile system

$F_G$  Gravitational force in the plumbline system

$A$  Projected cross-sectional area of vehicle

$q$  Dynamic pressure

$f(\alpha, \alpha_y)$  Vehicle configuration function

$\omega_E$  Earth's angular velocity vector in plumbline system

$V_{rm}$  Relative velocity vector (missile system)

$W$  Velocity vector for abnormal air movement in plumbline system
I. INTRODUCTION


In this paper an attempt is made to treat the optimum re-entry
problem in a simplified dynamical manner. The condition for optimality
is that the integral $\int (\mathrm{DRAG})^2\,dt$ be a minimum for fixed end points.
The first order differential equations of translational motion and the
algebraic equations defining the relative velocity vector are the
constraints. It is assumed that the attractive force of the earth and
the aerodynamic drag are the only forces influencing the vehicle's
motion. The vehicle has an offset center of gravity which aids
maneuverability. The performance analysis is based on the Pontryagin fixed
end point problem with dual control variables.


II.  STATEMENT  OF  THE  PROBLEM 


The problem herein presented is that of determining from a given
class of allowable trajectories the best one yielding mission fulfillment.

A space vehicle is assumed to initiate a re-entry into the earth's
atmosphere from some initial point above the earth's surface. The
influencing forces are the gravitational force of the earth and the
aerodynamic force created by atmospheric drag. The prediction of the
vehicle's performance is based on the assumption that a control system
is desired which will satisfy the following criteria:

1. Minimization of the accumulated g-forces on the vehicle's
occupants.

2. Capability of making a point landing.

In mathematical form the first of these becomes the minimization
of the integral of the square of the total aerodynamic acceleration.
The second can be accomplished by the proper choice of the initial
auxiliary variables.

The performance problem thus formulated becomes the Pontryagin
fixed end point problem, where the functional to be minimized has as
constraints the first order equations of motion and the finite relative
velocity equations. The boundary conditions are the initial and terminal
values of position and velocity. The yaw and true angles of attack are
taken as control variables.


Additional assumptions made are as follows:

1. The earth is a rotating sphere and the inverse-square gravity law
holds.

2. The mass of the vehicle is invariant with respect to time.

3. The vehicle has an offset center of gravity which is
invariant with respect to the vehicle.

4. The center of pressure is invariant with respect to the
center of gravity.


III.  COORDINATE  SYSTEMS 


Three rectangular cartesian coordinate systems will be used in
this paper. They are:

1. The plumbline space fixed coordinate system

2. The vehicle fixed missile system

3. The aerodynamic system.

A. PLUMBLINE SYSTEM

The plumbline system, Figure 1, has its origin at the earth's
center with the Y axis parallel to the gravity gradient at the launch
point. The X axis is parallel to the earth fixed launch azimuth and
the Z axis is such as to form a right-handed system.

B. MISSILE SYSTEM

The missile system, Figure 1, is defined with its origin at the
center of gravity of the vehicle and its $y_m$ axis parallel to the
longitudinal axis of the vehicle. The $x_m$ and $z_m$ axes are taken so as to form
a right-handed system which is parallel to the plumbline system at the
launch point.

As the vehicle moves along its trajectory, the missile system
undergoes a displacement with respect to the plumbline system. In
flight the two coordinate systems are related through Eulerian angles
which are measured by a gimbal. The flight direction of the vehicle is
defined by first rotating about the Y axis by $\phi_r$, then about the new
intermediate x axis by $\phi_y$, and finally about the z axis of the second
intermediate system by $\phi_p$. All three rotations are considered positive
counterclockwise when viewed from the positive end of the axis about
which the rotation is taken (see Figure 2).

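Thus, a position vector in the missile system may be written in
terms of a position vector in the plumbline system as

$$\mathbf{x}_m = [\phi_p][\phi_y][\phi_r]\,\mathbf{X}, \tag{1}$$

or

$$\begin{bmatrix} x_m \\ y_m \\ z_m \end{bmatrix} =
\begin{bmatrix} CP & SP & 0 \\ -SP & CP & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & CY & SY \\ 0 & -SY & CY \end{bmatrix}
\begin{bmatrix} CR & 0 & -SR \\ 0 & 1 & 0 \\ SR & 0 & CR \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}, \tag{1a}$$

where CP designates $\cos \phi_p$, etc. Expanding the equation above gives

$$\mathbf{x}_m =
\begin{bmatrix}
CP\,CR + SP\,SY\,SR & SP\,CY & -CP\,SR + SP\,SY\,CR \\
-SP\,CR + CP\,SY\,SR & CP\,CY & SP\,SR + CP\,SY\,CR \\
CY\,SR & -SY & CY\,CR
\end{bmatrix} \mathbf{X} = [A_D]\,\mathbf{X}. \tag{1b}$$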
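
The expansion of the triple product into the direction cosine matrix $[A_D]$ can be checked numerically. The sketch below multiplies the three elementary rotations of Equation (1a) and compares the result with the closed form of Equation (1b); the function names and the test angles are illustrative choices, not part of the paper.

```python
import numpy as np


def rot_p(phi_p):
    """[phi_p]: rotation about the z axis, as in Equation (1a)."""
    c, s = np.cos(phi_p), np.sin(phi_p)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])


def rot_y(phi_y):
    """[phi_y]: rotation about the intermediate x axis."""
    c, s = np.cos(phi_y), np.sin(phi_y)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]])


def rot_r(phi_r):
    """[phi_r]: rotation about the plumbline Y axis."""
    c, s = np.cos(phi_r), np.sin(phi_r)
    return np.array([[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]])


def a_d_expanded(phi_p, phi_y, phi_r):
    """Closed-form [A_D] written out term by term as in Equation (1b)."""
    CP, SP = np.cos(phi_p), np.sin(phi_p)
    CY, SY = np.cos(phi_y), np.sin(phi_y)
    CR, SR = np.cos(phi_r), np.sin(phi_r)
    return np.array([
        [CP * CR + SP * SY * SR, SP * CY, -CP * SR + SP * SY * CR],
        [-SP * CR + CP * SY * SR, CP * CY, SP * SR + CP * SY * CR],
        [CY * SR, -SY, CY * CR],
    ])


angles = (0.3, -0.7, 1.1)  # arbitrary test angles in radians
A_product = rot_p(angles[0]) @ rot_y(angles[1]) @ rot_r(angles[2])
A_closed = a_d_expanded(*angles)
print(np.allclose(A_product, A_closed))
```

Because $[A_D]$ is a product of proper rotations, it is orthogonal with unit determinant, so its transpose maps missile-system vectors back to the plumbline system.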

C. AERODYNAMIC SYSTEM


The aerodynamic system is defined with its origin at the center of
pressure of the vehicle and its $y_a$ axis coincident with the relative
velocity vector. The $x_a$ and $z_a$ axes are chosen to form a right hand system.

FIGURE 2. EULERIAN ANGLES


Again, as the vehicle moves in flight, there will be a displacement
of the missile and aerodynamic coordinate systems relative to one
another. The direction of the relative velocity vector or the $y_a$ axis
may be defined by the following rotations:

1. Rotate the vehicle fixed reference frame about the $y_m$ axis
such that the $x_m$ axis is brought to lie in the plane which
contains the $y_m$ axis and the relative velocity vector. Denote
this angle as $\alpha_y$.

2. Rotate about the new z axis to bring the $y_m$ axis coincident
with the relative velocity vector. Denote this angle as $\alpha$.
This angle is the so-called true angle of attack.

A position vector may now be written in the aerodynamic system in
terms of the missile system as

$$\mathbf{x}_a = [\alpha][\alpha_y]\,\mathbf{x}_m. \tag{2}$$

[The expanded forms (2a) and (2b) are not legible in the scanned source.]


Figure 3 illustrates this system.


IV.  BASIC  MECHANICS 


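Gravitational force. Since a spherical earth was assumed, Newton's
Law of Universal Gravitation gives the attractive force between
the earth and the vehicle as

$$\mathbf{F}_G = -\frac{GMm\,\mathbf{X}}{|\mathbf{R}|^3}, \tag{3}$$

with $|\mathbf{R}|$ the distance of the vehicle from the earth's center.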

Aerodynamic force. The aerodynamic force, Figure 4, is a force
due to atmospheric drag. It acts through the center of pressure and
the direction of the force is always parallel and opposite to the
relative velocity vector. Written in the aerodynamic system the force
takes the following form:

$$\mathbf{F}_a = \begin{bmatrix} 0 \\ -F_a \\ 0 \end{bmatrix}. \tag{4}$$

In the missile system

$$\mathbf{F}_{am} = [A_a]^T\,\mathbf{F}_a, \tag{5}$$

or

$$\begin{bmatrix} F_{amx} \\ F_{amy} \\ F_{amz} \end{bmatrix} =
\begin{bmatrix} -F_a\,S\alpha\,C\alpha_y \\ -F_a\,C\alpha \\ F_a\,S\alpha\,S\alpha_y \end{bmatrix}. \tag{5a}$$

The expression for the magnitude of $\mathbf{F}_a$ is taken to be the following:
$F_a = A\,q\,f(\alpha, \alpha_y)$. A is the projected cross-section area of the vehicle,
q the dynamic pressure, and $f(\alpha, \alpha_y)$ a factor which is determined by
the vehicle's configuration.

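Since the aerodynamic force is dependent upon the relative
velocity or the flow of air over the missile, it is appropriate at this
time to discuss this flow. It is assumed that the atmosphere in the
large moves with the earth. This gives at all times an air mass
movement with respect to the plumbline system of

$$\mathbf{X} \times \boldsymbol{\omega}_E - \mathbf{W},$$

where $\mathbf{W}$ is used to represent any abnormal air movement desired. The
relative velocity vector in the plumbline system is then given by

$$\mathbf{V}_R = \dot{\mathbf{X}} + [\mathbf{X} \times \boldsymbol{\omega}_E - \mathbf{W}], \tag{6}$$

or

$$\begin{bmatrix} V_{RX} \\ V_{RY} \\ V_{RZ} \end{bmatrix} =
\begin{bmatrix} \dot X \\ \dot Y \\ \dot Z \end{bmatrix} +
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} \times
\begin{bmatrix} \omega_{EX} \\ \omega_{EY} \\ \omega_{EZ} \end{bmatrix} -
\begin{bmatrix} W_X \\ W_Y \\ W_Z \end{bmatrix}. \tag{6a}$$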
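
Equation (6) translates directly into a few lines of code. The following sketch is only an illustration; the function name `relative_velocity` and the numerical values are assumptions, and all vectors are taken to be expressed in plumbline coordinates.

```python
import numpy as np


def relative_velocity(X, X_dot, omega_E, W):
    """Equation (6): V_R = X_dot + (X x omega_E - W), plumbline coordinates."""
    return X_dot + (np.cross(X, omega_E) - W)


X = np.array([1.0, 0.0, 0.0])        # position (arbitrary units)
omega_E = np.array([0.0, 1.0, 0.0])  # earth rotation vector (arbitrary units)
no_wind = np.zeros(3)

# A vehicle at rest in the plumbline frame sees the co-rotating air move past it.
print(relative_velocity(X, np.zeros(3), omega_E, no_wind))

# A vehicle carried along with the rotating earth has zero relative velocity.
print(relative_velocity(X, np.cross(omega_E, X), omega_E, no_wind))
```

The second case is a useful sanity check: with no abnormal air movement, a vehicle moving with the atmosphere experiences no relative wind.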


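In the missile system the relative velocity may be written as

$$\mathbf{V}_{rm} = [A_D]\,\mathbf{V}_R = \begin{bmatrix} V_{rmx} \\ V_{rmy} \\ V_{rmz} \end{bmatrix}, \tag{7}$$

or in terms of the aerodynamic system variables

$$\mathbf{V}_{rm} = [A_a]^T\,\mathbf{V}_r, \tag{8}$$

where $\mathbf{V}_r$ is the relative velocity vector in the aerodynamic system
[its component form is illegible in the scanned source].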
V. EQUATIONS OF MOTION


As previously stated, only gravitational and aerodynamic forces
are considered. Using Newton's Second Law, the translational motion of
the center of gravity with respect to the plumbline system is given by
the following set of second order differential equations:

$$\ddot{\mathbf{X}} = -\frac{GM\,\mathbf{X}}{|\mathbf{R}|^3} + \frac{[A_D]^T\,\mathbf{F}_{am}}{m}. \tag{9}$$

By making the following change of variable, the second order equations
of translational motion may be reduced to first order.

$$\mathbf{u} = \dot{\mathbf{X}} \tag{10}$$

The first order translational equations thus become

$$\dot{\mathbf{X}} = \mathbf{u}, \qquad \dot{\mathbf{u}} = -\frac{GM\,\mathbf{X}}{|\mathbf{R}|^3} + \frac{[A_D]^T\,\mathbf{F}_{am}}{m}. \tag{11}$$
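For convenience, the following definitions are made:

$$g = -\frac{GM}{|\mathbf{R}|^3}, \tag{12}$$

$$\frac{[A_D]^T\,\mathbf{F}_{am}}{m} = F'_a\,\mathbf{N}, \tag{13}$$

where $F'_a = F_a / m$ and

$$\mathbf{N} = \begin{bmatrix}
-(CP\,CR + SP\,SY\,SR)\,S\alpha\,C\alpha_y + (SP\,CR - CP\,SY\,SR)\,C\alpha + CY\,SR\,S\alpha\,S\alpha_y \\
-SP\,CY\,S\alpha\,C\alpha_y - CP\,CY\,C\alpha - SY\,S\alpha\,S\alpha_y \\
(CP\,SR - SP\,SY\,CR)\,S\alpha\,C\alpha_y - (SP\,SR + CP\,SY\,CR)\,C\alpha + CY\,CR\,S\alpha\,S\alpha_y
\end{bmatrix}. \tag{14}$$

Thus, the translational equations may be written as

$$\dot{\mathbf{u}} = F'_a\,\mathbf{N} + g\,\mathbf{X}. \tag{15}$$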
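
The unit vector N and the right-hand sides of the first order translational equations can be sketched in code as follows. This is an illustration under stated assumptions: N is formed as the transpose of the direction cosine matrix applied to the missile-system drag direction, g is taken as $-GM/|X|^3$ with $|X|$ the distance from the earth's center, and all function names and numerical values are hypothetical.

```python
import numpy as np


def direction_cosines(phi_p, phi_y, phi_r):
    """[A_D]: plumbline -> missile coordinates, written out term by term."""
    CP, SP = np.cos(phi_p), np.sin(phi_p)
    CY, SY = np.cos(phi_y), np.sin(phi_y)
    CR, SR = np.cos(phi_r), np.sin(phi_r)
    return np.array([
        [CP * CR + SP * SY * SR, SP * CY, -CP * SR + SP * SY * CR],
        [-SP * CR + CP * SY * SR, CP * CY, SP * SR + CP * SY * CR],
        [CY * SR, -SY, CY * CR],
    ])


def drag_direction(alpha, alpha_y, phi_p, phi_y, phi_r):
    """Unit vector N: missile-system drag direction mapped to plumbline axes."""
    sa, ca = np.sin(alpha), np.cos(alpha)
    say, cay = np.sin(alpha_y), np.cos(alpha_y)
    n_m = np.array([-sa * cay, -ca, sa * say])  # F_am / F_a
    return direction_cosines(phi_p, phi_y, phi_r).T @ n_m


def translational_rhs(X, u, f_a_prime, N, GM):
    """Return (X_dot, u_dot) with X_dot = u and u_dot = F'_a N + g X."""
    g = -GM / np.linalg.norm(X) ** 3  # assumed definition of g
    return u, f_a_prime * N + g * X


N = drag_direction(0.2, -0.4, 0.3, -0.7, 1.1)
print(np.dot(N, N))  # N is a unit vector

X = np.array([2.0, 0.0, 0.0])
u = np.array([0.0, 1.0, 0.0])
X_dot, u_dot = translational_rhs(X, u, 0.0, N, 8.0)
print(X_dot, u_dot)  # drag switched off: pure inverse-square attraction
```

The check N · N = 1 is the property that lets the square of the aerodynamic acceleration reduce to the square of its magnitude alone.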
VI. FORMULATION OF THE VARIATIONAL PROBLEM


The formulation of the variational problem requires further
consideration of the constraint equations emanating from

$$\mathbf{V}_{rm} = [A_D]\,\mathbf{V}_R = [A_a]^T\,\mathbf{V}_r \tag{16}$$

or

$$\mathbf{V}_{rm} = [\phi_p][\phi_y] \begin{bmatrix} a \\ b \\ c \end{bmatrix}, \tag{17}$$

where

$$\begin{bmatrix} a \\ b \\ c \end{bmatrix} = [\phi_r]\,\mathbf{V}_R \tag{18}$$

is the relative velocity vector referred to the intermediate system
located by $[\phi_r]$. The system of equations (17) is solved for $\phi_p$ and
$\phi_y$ to yield Equations (19) and (20) (see Appendix).
The roll angle, $\phi_r$, is given by Equation (21)
and is obtained by limiting $\phi_p$ and $\phi_y$ to real values. (See Appendix.)
[The displayed forms of Equations (19) through (21) are missing from the scanned source.]

As expressed in the problem statement, it is desired to determine
from a given class of allowable trajectories the best one yielding
mission fulfillment. This is accomplished by finding among all sets of
admissible control $\alpha(t)$, $\alpha_y(t)$ which transfer the vehicle from $\mathbf{X}_0$ to
$\mathbf{X}_T$ one for which the functional

$$D = \int_{t_0}^{t_T} [\mathrm{DRAG}]^2\,dt \tag{22}$$

takes on a minimum value. In this analysis the word drag will be used
synonymously with aerodynamic acceleration. Thus from Equation (15),

$$[\mathrm{DRAG}]^2 = F'_a\,\mathbf{N} \cdot F'_a\,\mathbf{N} = (F'_a)^2\,\mathbf{N} \cdot \mathbf{N} = (F'_a)^2, \tag{23}$$

and

$$D = \int_{t_0}^{t_T} (F'_a)^2\,dt\,; \qquad \dot D = (F'_a)^2. \tag{24}$$

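The Pontryagin H function may now be written as follows:

$$H = \boldsymbol{\lambda}_I \cdot \dot{\mathbf{X}} + \boldsymbol{\lambda}_{II} \cdot \dot{\mathbf{u}} + \lambda_7\,\dot D, \tag{25}$$

where

$$\boldsymbol{\lambda}_I = \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{bmatrix} \quad \text{and} \quad \boldsymbol{\lambda}_{II} = \begin{bmatrix} \lambda_4 \\ \lambda_5 \\ \lambda_6 \end{bmatrix}.$$

The $\lambda_i$, $i = 1, \ldots, 7$, are the auxiliary variables that are
incorporated in the same manner as the Lagrange multipliers in the
classical calculus of variations. Substituting into H from Equation (15)
results in the following:

$$H = \boldsymbol{\lambda}_I \cdot \mathbf{u} + \boldsymbol{\lambda}_{II} \cdot [F'_a\,\mathbf{N} + g\,\mathbf{X}] + \lambda_7\,(F'_a)^2. \tag{26}$$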
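
For a given state, set of auxiliary variables and control-dependent quantities, the H function is a scalar that is simple to evaluate. The sketch below assumes that N, the aerodynamic acceleration magnitude and g have already been computed elsewhere; the function name `hamiltonian` and the sample numbers are illustrative only.

```python
import numpy as np


def hamiltonian(lam_I, lam_II, lam_7, X, u, N, f_a_prime, g):
    """H = lam_I . u + lam_II . (F'_a N + g X) + lam_7 (F'_a)^2."""
    return lam_I @ u + lam_II @ (f_a_prime * N + g * X) + lam_7 * f_a_prime ** 2


lam_I = np.array([1.0, 0.0, 0.0])
lam_II = np.array([0.0, 1.0, 0.0])
lam_7 = 0.5
X = np.array([0.0, 5.0, 0.0])
u = np.array([2.0, 0.0, 0.0])
N = np.array([0.0, 1.0, 0.0])

H_value = hamiltonian(lam_I, lam_II, lam_7, X, u, N, 3.0, -1.0)
print(H_value)
```

With these sample values the three contributions are 2 from the velocity term, -2 from the acceleration term and 4.5 from the drag term.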

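The expressions for the auxiliary variables are obtained from the H
function and take the following form:

$$-\dot{\boldsymbol{\lambda}}_I = \frac{\partial H}{\partial \mathbf{X}} = F'_a\,\frac{\partial (\boldsymbol{\lambda}_{II} \cdot \mathbf{N})}{\partial \mathbf{X}} + (\boldsymbol{\lambda}_{II} \cdot \mathbf{N})\,\frac{\partial F'_a}{\partial \mathbf{X}} + \boldsymbol{\lambda}_{II}\,g + (\boldsymbol{\lambda}_{II} \cdot \mathbf{X})\,\frac{\partial g}{\partial \mathbf{X}} + \lambda_7\,\frac{\partial (F'_a)^2}{\partial \mathbf{X}}, \tag{27}$$

$$-\dot{\boldsymbol{\lambda}}_{II} = \frac{\partial H}{\partial \mathbf{u}} = \boldsymbol{\lambda}_I + F'_a\,\frac{\partial (\boldsymbol{\lambda}_{II} \cdot \mathbf{N})}{\partial \mathbf{u}} + (\boldsymbol{\lambda}_{II} \cdot \mathbf{N})\,\frac{\partial F'_a}{\partial \mathbf{u}} + \lambda_7\,\frac{\partial (F'_a)^2}{\partial \mathbf{u}}, \tag{28}$$

$$-\dot\lambda_7 = \frac{\partial H}{\partial D} = 0. \tag{29}$$

It is implied from Equation (29) that $\lambda_7$ = constant. The equations to
be solved for the control variables are given below.

$$\frac{\partial H}{\partial \alpha_y} = F'_a \left( \boldsymbol{\lambda}_{II} \cdot \frac{\partial \mathbf{N}}{\partial \alpha_y} \right) + (\boldsymbol{\lambda}_{II} \cdot \mathbf{N})\,\frac{\partial F'_a}{\partial \alpha_y} + \lambda_7\,\frac{\partial (F'_a)^2}{\partial \alpha_y} = 0, \tag{30}$$

$$\frac{\partial H}{\partial \alpha} = F'_a \left( \boldsymbol{\lambda}_{II} \cdot \frac{\partial \mathbf{N}}{\partial \alpha} \right) + (\boldsymbol{\lambda}_{II} \cdot \mathbf{N})\,\frac{\partial F'_a}{\partial \alpha} + \lambda_7\,\frac{\partial (F'_a)^2}{\partial \alpha} = 0. \tag{31}$$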

Equations (15) and (19) through (21) are the constraint and
definition equations which must be satisfied, and Equations (27) through
(31) are the characteristic equations. The complete set of algebraic
and differential equations needed for the problem solution has thus
been found. The desired minimum drag re-entry path will thus be one
which satisfies all the aforementioned equations. A closed form solution
to this set of equations does not seem probable, nor is the time spent in
searching for such a solution justifiable, since numerical solutions via
digital computers can be achieved to almost any degree of accuracy.
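VII. COMPUTATIONAL PROCEDURE


Functional Analysis:

When composing a computational procedure, it is sometimes found
convenient to write the equations in functional form. Listed below is
such a set. [The argument lists of the first three functions are only
partly legible in the scanned source.]

$\phi_r = \phi_r(\mathbf{V}_R)$

$\phi_y = \phi_y(\alpha, \alpha_y, \phi_r)$

$\phi_p = \phi_p(\alpha, \alpha_y)$

$\dot{\mathbf{u}} = \dot{\mathbf{u}}(\mathbf{X}, \mathbf{u}, \alpha, \alpha_y)$

$H = H(\mathbf{X}, \mathbf{u}, \boldsymbol{\lambda}, \alpha, \alpha_y)$

$\dot{\boldsymbol{\lambda}} = \dot{\boldsymbol{\lambda}}(\mathbf{X}, \mathbf{u}, \boldsymbol{\lambda}, \alpha, \alpha_y)$

$\dfrac{\partial H}{\partial \alpha_y} = \dfrac{\partial H}{\partial \alpha_y}(\mathbf{X}, \mathbf{u}, \boldsymbol{\lambda}, \alpha, \alpha_y) = 0$

$\dfrac{\partial H}{\partial \alpha} = \dfrac{\partial H}{\partial \alpha}(\mathbf{X}, \mathbf{u}, \boldsymbol{\lambda}, \alpha, \alpha_y) = 0$

Starting Values:

$m$, $GM$, $R_0$, $A$

$\mathbf{X}_0 = (X_0, Y_0, Z_0)^T$, $\mathbf{x}_{cp} = (x_{cp}, y_{cp}, z_{cp})^T$, $\mathbf{u}_0 = (\dot X_0, \dot Y_0, \dot Z_0)^T$

$\boldsymbol{\lambda}_{I0} = (\lambda_{10}, \lambda_{20}, \lambda_{30})^T$, $\boldsymbol{\lambda}_{II0} = (\lambda_{40}, \lambda_{50}, \lambda_{60})^T$

Atmospheric tables for $\rho$ as a function of altitude.

Atmospheric tables for $\mathbf{W}$ as a function of position.

Aerodynamic tables for $f(\alpha, \alpha_y)$ as a function of $(\alpha, \alpha_y)$.

"N" Line Computation:

(1) Using starting values, iterate Equations (30) and (31)
simultaneously for $\alpha$ and $\alpha_y$.

(2) Use $(\alpha, \alpha_y)$ from Step (1) along with starting values to
compute the following in order.

$\phi_r$ from Equation (21)

$\phi_y$ from Equation (20)

$\phi_p$ from Equation (19)

$\dot{\mathbf{u}}$ from Equation (15)

$H$ from Equation (26)

$\dot{\boldsymbol{\lambda}}_I$ from Equation (27)

$\dot{\boldsymbol{\lambda}}_{II}$ from Equation (28)

(3) Integrate to obtain the following:

$\dot{\mathbf{u}}$ for $\mathbf{u}$

$\mathbf{u}$ for $\mathbf{X}$

$\dot{\boldsymbol{\lambda}}_I$ for $\boldsymbol{\lambda}_I$

$\dot{\boldsymbol{\lambda}}_{II}$ for $\boldsymbol{\lambda}_{II}$

(4) Use integrated values from Step (3) as starting values for
the n + 1 line.

Cut-off Criteria:

$|\mathbf{V}_R| \le$ Mach 2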
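
The line-by-line procedure above can be sketched as a simple marching loop. The code below is only a schematic under stated assumptions: a scalar toy model (state x, control c, H = λc + c²) stands in for Equations (15) and (26) through (31), Newton iteration stands in for the unspecified iteration of Step (1), and explicit Euler integration stands in for the unspecified integration scheme of Step (3). None of the function names come from the paper.

```python
def newton_solve(dH, d2H, guess, tol=1e-12, max_iter=50):
    """Iterate a scalar stationarity condition dH(c) = 0 (stand-in for Step 1)."""
    c = guess
    for _ in range(max_iter):
        step = dH(c) / d2H(c)
        c -= step
        if abs(step) < tol:
            break
    return c


def march(x0, lam0, dt, cutoff, max_lines=10000):
    """March a toy problem line by line: x' = c, H = lam * c + c**2."""
    x, lam, c = x0, lam0, 0.0
    history = [(0.0, x, lam, c)]
    for n in range(1, max_lines + 1):
        # Step 1: solve dH/dc = lam + 2 c = 0 for the control on this line.
        c = newton_solve(lambda v: lam + 2.0 * v, lambda v: 2.0, c)
        # Step 2: evaluate the derivatives on this line.
        x_dot = c
        lam_dot = 0.0  # -dH/dx; H does not depend on x in this toy model
        # Step 3: explicit Euler integration to the next line.
        x += dt * x_dot
        lam += dt * lam_dot
        # Step 4: the integrated values become the next line's starting values.
        history.append((n * dt, x, lam, c))
        if cutoff(x):  # cut-off criterion ends the trajectory
            break
    return history


hist = march(x0=1.0, lam0=2.0, dt=0.25, cutoff=lambda x: x <= 0.25)
print(hist[-1])
```

In the actual problem the terminal position reached at cut-off depends on the initial auxiliary variables, so a loop of this kind would be repeated with different starting values of λ until the terminal conditions are met.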



VIII.  CONCLUSION 


The Maximum Principle has been employed to study the problem of
minimizing the integral of drag squared.

The cut-off criterion on the trajectory was $|\mathbf{V}_R| \le$ Mach 2. This
is a reasonable criterion since the expression used for the aerodynamic
force is valid only for velocity > Mach 2. If the desired terminal
position is not attained simultaneously with the cut-off criterion, then
a different set of initial auxiliary variables must be chosen. This
procedure must continue until all of the terminal conditions are
simultaneously satisfied.

No procedure has been developed in this paper for determining the
initial auxiliary variables. An attempt is being made to formulate the
transversality conditions for the problem and to apply the gradient
method as an aid to numerical solution.

The conditions used in this formulation are necessary, but not
sufficient, for the existence of an optimum.
APPENDIX


SOLUTION FOR $\phi_r$, $\phi_y$, AND $\phi_p$


A first algebraic solution of the set of Equations (17) for $\phi_p$
and $\phi_y$ yields

$$SP = \frac{-a\,V_{rmy} \pm V_{rmx} \sqrt{V_{rmx}^2 + V_{rmy}^2 - a^2}}{V_{rmx}^2 + V_{rmy}^2}, \tag{A1}$$

$$CP = \frac{a\,V_{rmx} \pm V_{rmy} \sqrt{V_{rmx}^2 + V_{rmy}^2 - a^2}}{V_{rmx}^2 + V_{rmy}^2}, \tag{A2}$$

$$SY = \frac{-b\,V_{rmz} \pm c \sqrt{b^2 + c^2 - V_{rmz}^2}}{b^2 + c^2}, \tag{A3}$$

$$CY = \frac{c\,V_{rmz} \pm b \sqrt{b^2 + c^2 - V_{rmz}^2}}{b^2 + c^2}. \tag{A4}$$


First, $\phi_p$ and $\phi_y$ are limited to real values by setting $a = 0$.
It is easily shown that

$$V_{rmx}^2 + V_{rmy}^2 - a^2 = b^2 + c^2 - V_{rmz}^2.$$

The choice $a = 0$ is allowable because of the dependency of the set of
Equations (17). As only two of the angles are required to locate a
vector in three-space, no unique solution exists for $\phi_p$, $\phi_y$, and $\phi_r$.
However, $a = 0$ provides that

$$SR = \frac{V_{RX}}{\sqrt{V_{RX}^2 + V_{RZ}^2}}, \qquad CR = \frac{V_{RZ}}{\sqrt{V_{RX}^2 + V_{RZ}^2}}, \qquad -\pi \le \phi_r \le \pi. \tag{A5}$$

Now, the corresponding values of $\phi_p$ and $\phi_y$ must be unique. To settle
the choice of sign in Equations (A1) through (A4), it is recalled that
the determinants of the right-handed rotation matrices $[\phi_y]$ and $[\phi_p]$
must be equal to unity. This eliminates two of the four possible sign
combinations in Equations (A1) and (A2). The choice between the two
remaining possibilities is made according to the relation between the
aerodynamic coordinate system and the relative velocity vector $\mathbf{V}_R$ set
forth in Section IV. The resulting equations are

$$SP = \frac{V_{rmx}}{\sqrt{V_{rmx}^2 + V_{rmy}^2}}, \qquad CP = \frac{V_{rmy}}{\sqrt{V_{rmx}^2 + V_{rmy}^2}}, \qquad -\pi \le \phi_p \le \pi \tag{A6}$$

and

[The corresponding expressions for SY and CY are illegible in the scanned source.]

$$-\pi \le \phi_y \le \pi.$$

An additional result of setting $a = 0$ can be seen from Equation (18) as

$$-\pi \le \phi_y \le \pi.$$

The limits on $\phi_r$, $\phi_p$, and $\phi_y$ are chosen to provide a full
revolution of freedom and eliminate any excess motion.
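
The a = 0 construction can be illustrated numerically. The sketch below recovers one consistent set of angles from the relative velocity in plumbline and missile coordinates and verifies that they reproduce the missile-system vector. It assumes the elementary rotation matrices $[\phi_p]$, $[\phi_y]$, $[\phi_r]$ of the main text and positive square roots throughout; since the paper's final expressions for $\phi_y$ are not legible in the scan, the root used for $\phi_y$ is simply the one that satisfies Equations (17). All function names are hypothetical.

```python
import numpy as np


def rot_p(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])


def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]])


def rot_r(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]])


def euler_angles(V_R, V_rm):
    """Return (phi_p, phi_y, phi_r) using the a = 0 choice."""
    # Roll: SR proportional to V_RX, CR proportional to V_RZ.
    phi_r = np.arctan2(V_R[0], V_R[2])
    a, b, c = rot_r(phi_r) @ V_R  # intermediate vector; a vanishes
    # Pitch: SP proportional to V_rmx, CP proportional to V_rmy.
    phi_p = np.arctan2(V_rm[0], V_rm[1])
    k = np.hypot(V_rm[0], V_rm[1])  # equals sqrt(b^2 + c^2 - V_rmz^2)
    # Yaw: root of the quadratic solution consistent with positive k.
    phi_y = np.arctan2(c * k - b * V_rm[2], b * k + c * V_rm[2])
    return phi_p, phi_y, phi_r


V_R = np.array([120.0, -340.0, 75.0])
V_rm = rot_p(0.4) @ rot_y(-0.2) @ rot_r(0.9) @ V_R
p, y, r = euler_angles(V_R, V_rm)
print(np.allclose(rot_p(p) @ rot_y(y) @ rot_r(r) @ V_R, V_rm))
```

The recovered angles need not equal the angles used to generate the test vector, which reflects the statement above that no unique solution exists until a convention such as a = 0 is imposed. Degenerate cases in which a square root vanishes are not handled in this sketch.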
An approach to optimal guidance synthesis is developed in which an
ensemble-averaged second order approximation to the performance function is minimized
subject to constraints on the means and variances of other functions. The
minimization is with respect to coefficients of assumed polynomial approximations of a
linear feedback control law (in which the state is perfectly known) and coefficients
in a linear termination law. A brief comparison is drawn with deterministic
neighboring extremal control. While attention is directed mainly to first order necessary
conditions, some comments are made on numerical solution by first and second
order successive approximation methods. Extensions to include disturbances other
than initial errors and to include state estimation errors are discussed briefly.
Introduction


The earliest theoretical approaches to optimal guidance (Refs. 1 and 2)
lead to computational methods for synthesizing linear feedback systems
furnishing an approximation optimal to second order in an expansion about a
given optimal reference trajectory. While the resulting systems fulfill their
theoretical promise in providing high performance, terminal accuracy is
found to be wanting, and the practical mechanization of the feedback law is
encumbered by the need for storing time-varying "gains". Recent studies of
the terminal accuracy problem (Refs. 3 and 4) indicate that a large
improvement may be realized by transverse state comparison with the reference
trajectory and suggest that this relatively simple procedure may be more effective
than the addition of quadratic terms in the feedback approximation.

The present paper reports an idea for a synthesis scheme in which an
ensemble-averaged second order approximation to the performance index is
minimized with respect to certain parameters. These parameters include the
coefficients in three polynomials in time which are used in place of general
time-varying functions. Polynomial approximations are used for (1) the
control programs of the optimal reference trajectory; (2) the state variable
histories of the optimal reference trajectory; (3) the feedback gains for the
assumed linear feedback control system. Additional parameters to be
optimized are the coefficients in an assumed linear rule for termination of
perturbed trajectories. The treatment is based upon the statistical methods
pioneered in Refs. 5 and 6 in connection with synthesis of optimal midcourse
guidance approximations.
Formulation of the Problem


The  dynamical  system  under  consideration  satisfies 
$$ \dot{x} = f(x, u, t) \tag{1} $$


where 

x(t)  is  an  n-vector  of  state  variables 
u(t)  is  an  m-vector  of  control  variables 
t is  the  independent  variable  (hereafter  called  time) 

f is  an  n-vector  of  known  functions  of  x,  u,  t 


The system operates over a finite time interval. The initial time $t_0$ is
assumed fixed, but the initial state is a vector of random variables with
specified ensemble average properties. The problem is to minimize the
ensemble average of a given function of the terminal conditions*

$$ J = E\{\varphi[x(t_f), t_f]\} \tag{2} $$

subject to the constraints

$$ E\{\psi[x(t_f), t_f]\} = 0 \tag{3} $$

$$ E\{(\psi^j[x(t_f), t_f])^2\} = N^j \tag{4} $$

where $g^j$ is the $j$th component of any vector $g$. J is to be minimized while
specifying the means and variances of the functions $\psi$.


* $t_f$ is the terminal time
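It is assumed that nominal* control programs $\bar{u}(t)$ have been determined
which minimize $\varphi[x(t_f), t_f]$ while meeting constraints
$\psi[x(t_f), t_f] = 0$. Thus,

$$ \dot{\bar{x}} = f(\bar{x}, \bar{u}, t) \tag{5} $$

$$ \dot{\bar{\lambda}} = -\left(\frac{\partial f}{\partial x}\right)^T \bar{\lambda} \tag{6} $$

$$ \frac{\partial H}{\partial u} = 0 \qquad \left(\frac{\partial^2 H}{\partial u^2} \neq 0 \text{ is assumed}\right) \tag{7} $$

with boundary conditions

$$ t_0,\ \bar{x}(t_0) \text{ specified} \tag{8} $$

$$ \psi[\bar{x}(\bar{t}_f), \bar{t}_f] = 0 \tag{9} $$

$$ \bar{\lambda}(\bar{t}_f) = \left(\frac{\partial \varphi}{\partial x} + \bar{\nu}^T \frac{\partial \psi}{\partial x}\right)^T_{t = \bar{t}_f} \tag{10} $$

$$ (\bar{\lambda}^T f)_{t = \bar{t}_f} = -\left(\frac{\partial \varphi}{\partial t} + \bar{\nu}^T \frac{\partial \psi}{\partial t}\right)_{t = \bar{t}_f} \tag{11} $$

where $(\ )^T$ is the transpose of $(\ )$, and the $ij$th element of a matrix
$\partial g / \partial y$, $g$ and $y$ both vectors, is $\partial g^i / \partial y^j$.
With $\bar{u}(t)$ and $\bar{x}(t)$ specified, the analysis will be carried out in
terms of the perturbation quantities $\delta u(t)$ and $\delta x(t)$, where, by
definition

$$ u(t) = \bar{u}(t) + \delta u(t) \tag{12} $$

$$ x(t) = \bar{x}(t) + \delta x(t) \tag{13} $$

* $\overline{(\ )}$ is $(\ )$ evaluated on the nominal path.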
The minimization of J is to be carried out with respect to a number of
parameters of the problem. One set of these parameters appears in the rule
for terminating trajectories, which must be imposed because there is no
automatic way to determine $t_f$ on each member of the ensemble. Suppose that
the termination rule is described by

$$ \Omega[x(t), t]_{t = t_f} = 0 \tag{14} $$

where $\Omega$ may be any once differentiable function of x and t. Consistent
with the second order approximation theory to be employed, the optimality of
the reference trajectory leads to the result that the most general $\Omega$
relevant in the analysis is a linear function of x and t. To first order,
then, (14) may be written as

$$ 0 = \Omega[\bar{x}(\bar{t}_f), \bar{t}_f] + \left[\frac{\partial \Omega}{\partial x}\,\delta x\right]_{t = \bar{t}_f} + \dot{\Omega}\, dt_f \tag{15} $$

where, by definition,

$$ \dot{\Omega} = \left[\frac{\partial \Omega}{\partial x} f + \frac{\partial \Omega}{\partial t}\right]_{t = \bar{t}_f} $$

Since $t_f = \bar{t}_f + dt_f$, the terminal time may be determined from (15)
provided $\dot{\Omega} \neq 0$. This is simply the statement that $\Omega$ must
not be a constant of the motion if (15) is to give a solution for $t_f$.

Solving (15) for $dt_f$ gives

$$ dt_f = \Omega + \Omega_x\, \delta x(\bar{t}_f) \tag{16} $$

where, by definition,

$$ \Omega = \frac{\Omega[\bar{x}(\bar{t}_f), \bar{t}_f]}{(-\dot{\Omega})}, \qquad \Omega_x = \frac{(\partial \Omega / \partial x)_{t = \bar{t}_f}}{(-\dot{\Omega})} $$

$\Omega$ and $\Omega_x$ are the parameters to be optimized; it is evident there
is no loss of generality in assuming $\dot{\Omega} = -1$.


The system controls for each member of the ensemble are assumed to
satisfy

$$ u(t) = \sum_{i=0}^{N_u} a_i t^i + \sum_{j=0}^{N_g} b_j t^j \left[ x(t) - \sum_{k=0}^{N_x} c_k t^k \right] \tag{17} $$

where $N_u$, $N_g$ and $N_x$ are specified, and $a_i$, $b_j$, $c_k$ are
unspecified. The first term in (17) is the polynomial approximation to
$\bar{u}(t)$. The second term is the result of an assumption that the feedback
control is linear in x(t). The $\sum c_k t^k$ is the polynomial approximation
to $\bar{x}(t)$. $\sum b_j t^j$ is the assumed form of the feedback gain. The
most general linear feedback would use an m x n matrix, say A(t), of
unspecified functions of time. Thus, the formulation used here replaces the
most general linear feedback control system, which would require storage of
$\bar{u}(t)$, $\bar{x}(t)$ and A(t), by a linear feedback control utilizing
polynomial approximations. It may be verified by inspection that $a_i$,
$b_j$, $c_k$ are m x 1, m x n, n x 1 matrices for each i, j, k respectively.
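
The control law (17) can be sketched numerically as follows. This is only an illustrative sketch, not part of the original analysis: the function names `poly_eval` and `control`, and the array layout (coefficients stacked along the first axis, so `a` has shape (N_u+1, m), `b` has shape (N_g+1, m, n), and `c` has shape (N_x+1, n)), are assumptions made here for the example.

```python
import numpy as np

def poly_eval(coeffs, t):
    """Evaluate sum_k coeffs[k] * t**k; coeffs is stacked along axis 0 (assumed layout)."""
    coeffs = np.asarray(coeffs, dtype=float)
    powers = float(t) ** np.arange(coeffs.shape[0])  # [1, t, t^2, ...]
    return np.tensordot(powers, coeffs, axes=(0, 0))

def control(t, x, a, b, c):
    """Polynomial feedback control of Eq. (17): u = at + bt (x - ct)."""
    at = poly_eval(a, t)  # m-vector, approximates the nominal control
    bt = poly_eval(b, t)  # m x n matrix, polynomial feedback gain
    ct = poly_eval(c, t)  # n-vector, approximates the nominal state
    return at + bt @ (np.asarray(x, dtype=float) - ct)

# Hypothetical coefficients: m = 1 control, n = 2 states.
a = [[1.0], [2.0]]                # at = 1 + 2 t
b = [[[0.5, 0.0]]]                # bt = [0.5, 0.0] (constant gain)
c = [[0.0, 0.0], [1.0, 1.0]]      # ct = [t, t]
u = control(2.0, [3.0, 5.0], a, b, c)
```

Only the polynomial coefficients need to be stored, which is the storage saving the formulation is aiming at.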

The problem, then, is to simultaneously choose all parameters $\Omega$,
$\Omega_x$, a, b, c to minimize J while satisfying the mean and variance
constraints.


Derivation  of  Necessary  Conditions  for  the  Optimal  Parameters 


The  approach  used  here  will  be  to  adjoin  all  relevant  constraints  to  the 
performance  index  by  means  of  Lagrange  multipliers.  Hence, 


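$$ J = E\{\varphi[x(t_f), t_f]\} + \nu^T E\{\psi[x(t_f), t_f]\} + \tfrac{1}{2} \sum_{j=1}^{p} k_j \left[ E\{(\psi^j[x(t_f), t_f])^2\} - N^j \right] + E \int_{t_0}^{t_f} (\bar{\lambda}^T + \delta\lambda^T)(f - \dot{x})\, dt \tag{18} $$


The essential approximation of the analysis is the assumption of "small"
perturbations. The ensemble of system trajectories is treated by expanding
about the nominal path and keeping terms through quadratic in $\delta x$ and
$\delta u$, but dropping higher order terms. As an example:*


$$ E\{\varphi[x(t_f), t_f]\} = \varphi[\bar{x}(\bar{t}_f), \bar{t}_f] + E\left[\frac{\partial \varphi}{\partial x} dx + \frac{\partial \varphi}{\partial t} dt\right]_{t = \bar{t}_f} + \tfrac{1}{2} E\left[ dx^T \frac{\partial^2 \varphi}{\partial x^2} dx + dx^T \frac{\partial^2 \varphi}{\partial x\, \partial t} dt + dt \frac{\partial^2 \varphi}{\partial t\, \partial x} dx + dt \frac{\partial^2 \varphi}{\partial t^2} dt \right]_{t = \bar{t}_f} \tag{19} $$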


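Evaluation of (19) requires evaluation of $d[x(t_f)]$. But

$$ d[x(t_f)] = x(t_f) - \bar{x}(\bar{t}_f) = \delta x(\bar{t}_f) + \int_{\bar{t}_f}^{t_f} \dot{x}(\tau)\, d\tau \tag{20} $$


* The $ij$th element of $\partial^2 h / \partial y\, \partial z$, where h is scalar
and y and z are vectors, is defined to be $\partial^2 h / \partial y^i\, \partial z^j$.

$$ \dot{x}(\tau) = \dot{\bar{x}}(\tau) + \delta\dot{x}(\tau) = \dot{\bar{x}}(\bar{t}_f) + \ddot{\bar{x}}(\bar{t}_f)(\tau - \bar{t}_f) + \cdots + \left[\frac{\partial f}{\partial x} \delta x + \frac{\partial f}{\partial u} \delta u\right]_{\bar{t}_f} + \cdots \tag{21} $$

Substituting (21) into (20) and dropping terms above second order gives

$$ d[x(t_f)] = \delta x(\bar{t}_f) + \dot{\bar{x}}(\bar{t}_f)\, dt_f + \tfrac{1}{2} \ddot{\bar{x}}(\bar{t}_f)\, dt_f^2 + \left[\frac{\partial f}{\partial x} \delta x + \frac{\partial f}{\partial u} \delta u\right]_{t = \bar{t}_f} dt_f \tag{22} $$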

Everywhere in (19) that dt appears it is replaced by
$\Omega + \Omega_x\, \delta x(\bar{t}_f)$.* This makes all terminal functions
depend only on quantities evaluated at $t = \bar{t}_f$.

The $\dot{x}$ terms in (18) are integrated by parts. The Lagrange multipliers
$\nu$ are written

$$ \nu = \bar{\nu} + d\nu \tag{23} $$

where $d\nu$ is assumed to be of order $\delta x(\bar{t}_f)$. It is further
assumed that the Lagrange multiplier functions $\delta\lambda(t)$ are of order
$\delta x(t)$.** The Lagrange multipliers $k_j$ are assumed to be order one.
These assumptions all rely on the basic assumption that the entire ensemble
of trajectories lies within an adequately small neighborhood of the
reference path.

Expansion of (18) through second order and grouping similar terms gives


* $\Omega$ is assumed to be of the order of $E[\delta x(\bar{t}_f)]$.

** Note that $\delta\lambda(t)$ is different on each member of the ensemble,
just as $\delta x(t)$ is. $\bar{\lambda}(t)$ is the same for each member, given
by (6) and (10).
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j . * + [ff<l  + i«y  f?«V]  e[6x(tf)]  + [|f  i*  I*]  O 

1 I t_tf 


+ a 


[!f  If - + MSxTrf  * 2CixTtf 

t=t„  OX 


t a*  si 

Bx 


* (if)\  * * *«*>] 


j=i 


6x}  _ + n{$fix+  |f  H + (ff)  + ikj  +^V\  - 

t=t£  J=1  f 


p p 2 

e[6x(tf)]  - ^ k.  Nj  + ^n2[£  + ^k.  (0j)  ] _ + diyT[(f^  + inx> 

l=i  J ‘ 


P , 2. 
J-l 


C(fix)  + _ - [\TP'(  6x)]fc  - + [XTR(6x)3t  t - e[6XT6xlt  = - 


t-t 


' O 


C[6X^6x]^  + C J [(XT+  6X^)(x  + fix)  + (X  + 6X  )f  + X (^x  fix 


T . ,T.  - . T.  5f 


o t 


ftf  1 P.  TS2H  . , T 52H  , . T d2  II  r 

+ — 6u)  + - [fix  — - fix  + fix  r~r-  6u  + fiu  fix 

Su  2 3x2 


^xSu 


9uSx 


+ fiJMr  fiu]  + 6XA(f^6x  + ~ 6 u) } dt 


T.df  . . 9f 


du 


Sx w au 


(24) 


where  extensive  use  has  been  made  of  the  following  notational  substitutions: 


I is  the  identity  matrix 

A = Ml*  + K1 
{)  ax  x + 


at 


n . „ :tAi  + m~x  + Aii  + in 

( ) - X J1*1  sut  il  1 its*"  .,2 

ax  Bt 


and  all  derivatives  are  evaluated  on  the  reference  path. 


Using (12), (13), (17), $\delta u(t)$ may be written as

$$ \delta u(t) = u(t) - \bar{u}(t) = \sum_{i=0}^{N_u} a_i t^i - \bar{u}(t) + \sum_{j=0}^{N_g} b_j t^j \left[ \delta x(t) + \bar{x}(t) - \sum_{k=0}^{N_x} c_k t^k \right] \tag{25} $$

The following purely symbolic notations are introduced for convenience:

$$ \sum_{i=0}^{N_u} a_i t^i \equiv at, \qquad \sum_{j=0}^{N_g} b_j t^j \equiv bt, \qquad \sum_{k=0}^{N_x} c_k t^k \equiv ct $$

With these substitutions $\delta u(t)$ may be written as

$$ \delta u(t) = at - \bar{u}(t) + bt[\delta x(t) + \bar{x}(t) - ct] \tag{26} $$

$\delta u(t)$ from (26) may be substituted into (24), giving J as a function of
$\Omega$, $\Omega_x$, a, b, c and other quantities. A necessary condition for
optimal choice of the unspecified parameters is that dJ be zero for arbitrary
first order changes in the parameters.


By virtue of the optimality of the reference trajectory, all first order
terms in dJ, and the $\delta u(\bar{t}_f)$ term also, drop out. Thus, dJ is
composed entirely of second order terms and, by a straightforward
development, may be written as


dJ  = e 


x dx  dx 
T 


j=i 

2 


j=l 


t=t. 


{[*nx  + fffi  + (tf)  + Ikj  *’(  *T  + + «[* 

j=l 


l k <*>>*]  ♦ d,T*}  dO*  tr{x[(|f)‘  * + 0Ti 


T , - . .T 


j 


t = t 


i=i  f 

p .1  J. 


V . j . T _ _ £ .2 

+ i'Jttif.  )[ni  + n^  k <*J)  +di/T*]}  do 

i=i  i=i  c~tt 

„rIrr.;T  .,t/m  si,.\  . t/32h  32h .t  s2h 

+ eJ  iLsx  +6x(st+^bt)  + 6x  (^•t^b,  + «bt' 

t dx 

o 


du  dx 


,T  d2  H , , \ . r _ x - ,.nT/d2H  Q2  H , . 

+ t bt 


+ (bt  )T  bt^  +[at  - u + bt(  x - ct ) ]T  ^ : 


su-  ' •BuSx  au2 


N„ 


+ 1 [6xTSTa7  + Cat ' “ + <T)f^]‘'d»1 


i=0 


i 

du‘ 
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2 2 
>x+x-ct)^6x^  + [at  - u + bt(6x  + x -ct)]F 

du 


+ (6xT L + -tT)“^Jt''db^  - ^[*>xT  + Cat  - u + bt(6x  + x - ct)j 


iT  92H 


+ (6xT  L + -tT)  |f  ]tkdck|  dt 


where, by definition, tr stands for trace and

$$ X(t) = E[\delta x(t)\, \delta x^T(t)] $$

Setting dJ = 0 provides necessary conditions for extremizing choices
of the control parameters. The Lagrange multipliers $\delta\lambda$ satisfy


' /df  9f  lA’  - /d  H , T d H dH 
6X  + (d^+  dubt)6X  + ( 2 +(bt)  dudx  + dxdu 


2 2 2 2 
d H ..  ..T  d H d H ,TdH,,\. 

~2  * (bt)  auix  * 3TJubt  * <bt)  71bt)6* 

dx  du 


2 2 

+ (^  + (bt)T  ^~|)Cat-u  + bt(x-ct)]  = 0 


2 T T 

a $ . , 


. ~t/m\  . 


- {[H+2(M)(H)nx+(t!)fix+nx(i)+nx'i:jix 


I k,(i£  nx)  (It  * *,nx)]6x  * °[*°xT  * 


T T • T P 


e$\  /e®\  \ T / u<r 

dx  / \ dx  i + Zj  jV  dx 

j=l  ' 


*inx)  *)]t(^+*nx)dl' 
Because $\delta x(t)$ is a vector of random variables, $\delta\lambda(t)$ is
also. Neither can be used computationally. However, it may be verified by
direct substitution that

$$ \delta\lambda(t) = L(t)\, \delta x(t) + \ell(t) \tag{30} $$

where L(t) and $\ell(t)$ are the same for every member of the ensemble.

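$$ \dot{L} + \left(\frac{\partial f}{\partial x} + \frac{\partial f}{\partial u} bt\right)^T L + L \left(\frac{\partial f}{\partial x} + \frac{\partial f}{\partial u} bt\right) + \frac{\partial^2 H}{\partial x^2} + (bt)^T \frac{\partial^2 H}{\partial u\, \partial x} + \frac{\partial^2 H}{\partial x\, \partial u} bt + (bt)^T \frac{\partial^2 H}{\partial u^2} bt = 0 \tag{31} $$

$$ \dot{\ell} + \left(\frac{\partial f}{\partial x} + \frac{\partial f}{\partial u} bt\right)^T \ell + \left( L \frac{\partial f}{\partial u} + \frac{\partial^2 H}{\partial x\, \partial u} + (bt)^T \frac{\partial^2 H}{\partial u^2} \right) \left( at - \bar{u} + bt(\bar{x} - ct) \right) = 0 \tag{32} $$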

The boundary conditions for L and $\ell$ are evident by inspection of (29).

To obtain the remainder of the necessary conditions resulting from dJ = 0,
it is necessary to develop the differential equations for $E(\delta x)$ and
for X.

First, it may be noted that $E(\delta x)$ appears only in terms that are second
order, hence it need be calculated only to first order. The linearized
perturbations of (1) with $\delta u(t)$ from (26) immediately give

$$ \frac{d}{dt} E(\delta x) = \left(\frac{\partial f}{\partial x} + \frac{\partial f}{\partial u} bt\right) E(\delta x) + \frac{\partial f}{\partial u} \left[ at - \bar{u} + bt(\bar{x} - ct) \right] \tag{33} $$


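It is convenient to define

$$ \delta x = E(\delta x) + \delta\tilde{x} \tag{34} $$

so that

$$ X = E(\delta x)\, E(\delta x^T) + \tilde{X} \tag{35} $$

$$ \tilde{X} = E[\delta\tilde{x}\, \delta\tilde{x}^T] \tag{36} $$

Then, by direct substitution

$$ \dot{\tilde{X}} = \left(\frac{\partial f}{\partial x} + \frac{\partial f}{\partial u} bt\right) \tilde{X} + \tilde{X} \left(\frac{\partial f}{\partial x} + \frac{\partial f}{\partial u} bt\right)^T \tag{37} $$

The boundary conditions for $E(\delta x)$ and X are given by

$$ E[\delta x(t_0)], \text{ specified} \tag{38} $$

$$ E[\delta x(t_0)\, \delta x^T(t_0)] = X(t_0), \text{ specified} \tag{39} $$

There are thus $2(n^2 + n)$ differential equations for $E(\delta x)$,
$\tilde{X}$, $\ell$, L and corresponding boundary conditions, half at $t_0$ and
half at $\bar{t}_f$. The conditions at $\bar{t}_f$ involve the Lagrange
multipliers $d\nu$ and $k_j$; constraint equations (3) and (4) furnish the
additional required 2p relations.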
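
The forward half of this two-point boundary value problem, Eqs. (33) and (37), can be sketched as a simple moment propagation. The sketch below is illustrative only: the name `propagate_moments`, the callables `F` and `g`, and the fixed-step Euler integration are assumptions made here, with F(t) standing for $\partial f/\partial x + (\partial f/\partial u)\, bt$ and g(t) for $(\partial f/\partial u)[at - \bar{u} + bt(\bar{x} - ct)]$ evaluated on the reference path.

```python
import numpy as np

def propagate_moments(F, g, mean0, cov0, t0, tf, steps=1000):
    """Euler-integrate d(mean)/dt = F mean + g and d(cov)/dt = F cov + cov F^T."""
    mean = np.array(mean0, dtype=float)   # E(delta x), Eq. (33)
    cov = np.array(cov0, dtype=float)     # central covariance, Eq. (37)
    dt = (tf - t0) / steps
    for k in range(steps):
        t = t0 + k * dt
        Fk = np.asarray(F(t), dtype=float)
        mean_dot = Fk @ mean + np.asarray(g(t), dtype=float)
        cov_dot = Fk @ cov + cov @ Fk.T
        mean = mean + dt * mean_dot
        cov = cov + dt * cov_dot
    return mean, cov

# Hypothetical scalar example: F = -1, no forcing, unit initial mean and variance.
mean_f, cov_f = propagate_moments(lambda t: [[-1.0]], lambda t: [0.0],
                                  [1.0], [[1.0]], 0.0, 1.0)
```

In a full synthesis the backward equations for L and $\ell$ would be integrated from $\bar{t}_f$ in the same manner; a higher order integrator would normally replace the Euler step used here for brevity.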

From (16) it is clear that $\Omega$ is a bias in the choice of $dt_f$. Such a
bias gives added flexibility because the differences of at and ct from
$\bar{u}$ and $\bar{x}$ respectively cause $E[\delta x(t)]$ to be non-zero.
Applying dJ = 0, $\Omega$ may be solved for explicitly in terms of other
parameters of the problem (equation (40)). Thus, $\Omega$ need not appear as an
unknown in any numerical optimization procedure.


The parameters $\Omega_x$ may also be solved for, from dJ = 0, in terms of
quantities evaluated at $t = \bar{t}_f$:


«x 


- < 


[ft  $ + ft  ^ k.(  0j)2  + di/T  4>]e(  6xT)X  + ( 

j=l 


dx  J 


d$  df. 
dx  dx 


W 

j=l 


;J»4i 

dx 


> (41) 


t*t 


f 


After  utilizing  (40)  and  (41),  the  unspecified  parameters  are  a,  b,  c.  These 
must  satisfy  integral  relations  which  result  from  dJ  = 0,  for  arbitrary  small 
changes  da,  db,  dc. 


- f & + Ut  - “ * bt<*  - ct)f  ~2  * ce(‘xT)L + <42) 


du 

i = 0,  1,  2,  - N 


u 


9 2 

I rp  rp  ^ 

0 = , J |[x  + (X  - ct)e(6xT)]^y^  + [x(bt)  + e(6x)[at  - u + bt(x  - ct)]  ~ 
*o 


+ (X  - ct)G(fixT)(bt)T  + (x  - ct)[at  - u + bt(x  - ct)]A] 


F,  a2  h 


du 


([X  + (x-ct)e(6xT)]L  + [e(6x)  + (x-ct)3tT)|f}t^  dt 

j = 0,  1,  2,  - -,  N 


(43) 
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0 


= J ^C(  6x~)  ^ ~ + [at  - u + bt(6(6x)  + x - ct)]T  — ~ + [£(6xT)L  + 


,T-idfl  ,k 


du 

k -0,1,2,  - -,  N 


dt 

(44) 


The parameters a, b, c may not be eliminated algebraically because other
quantities depend on them. The necessary conditions involving $E(\delta x)$,
X, $\ell$, L, a, b, c are all interlocked. This is characteristic of dynamic
system optimization problems with control parameters. Although such problems
are seldom easy, the one considered here presents no new conceptual
difficulties.
An Alternative Approach to the Necessary Conditions Derivation


This analysis is based on second order expansions and is closely related
to the second variation guidance schemes of Refs. 1 and 2. There, for a single
perturbed trajectory, the second variation of the performance index is
minimized subject to satisfaction of the $\dot{x} = f$ and $\psi = 0$
constraints. One proceeds by making stationary the function
$\Phi = \varphi + \nu^T \psi$, where properly chosen $\nu$ will lead to
satisfaction of the terminal constraints. The second variation of $\Phi$,
from Refs. 1 and 2, is


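$$ J_2 = \left[ dx^T \frac{\partial^2 \Phi}{\partial x^2} dx + dx^T \frac{\partial^2 \Phi}{\partial x\, \partial t} dt + dt \frac{\partial^2 \Phi}{\partial t\, \partial x} dx + dt \frac{\partial^2 \Phi}{\partial t^2} dt \right]_{t = \bar{t}_f} + \int_{t_0}^{\bar{t}_f} \left[ \delta x^T \frac{\partial^2 H}{\partial x^2} \delta x + \delta x^T \frac{\partial^2 H}{\partial x\, \partial u} \delta u + \delta u^T \frac{\partial^2 H}{\partial u\, \partial x} \delta x + \delta u^T \frac{\partial^2 H}{\partial u^2} \delta u \right] dt \tag{45} $$

Since the reference path satisfies all the constraints, it is sufficient to
adjoin the linearized perturbation constraints

$$ \delta\dot{x} = \frac{\partial f}{\partial x} \delta x + \frac{\partial f}{\partial u} \delta u \tag{46} $$

$$ d\psi = \left[ \frac{\partial \psi}{\partial x} dx + \frac{\partial \psi}{\partial t} dt_f \right]_{t = \bar{t}_f} = 0 \tag{47} $$

Then, given $\delta x(t_0)$, $\delta u(t)$ is chosen to minimize
$\tfrac{1}{2} J_2$ while satisfying constraints (46) and (47). This leads to a
linear feedback relation

$$ \delta u(t) = -A(t)\, \delta x(t) \tag{48} $$

It is tacitly assumed that $\bar{x}(t)$ and $\bar{u}(t)$ as well as A(t) are
"stored" (available to the guidance system).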


The significant operational simplification of neighboring extremal guidance
introduced in this paper is the substitution of a relatively small number of
polynomial coefficients for the functions $\bar{u}(t)$, $\bar{x}(t)$, A(t).
The general functions of time would require tables of values vs. time in
operation with a digital computer. Use of polynomial coefficients instead may
be expected to greatly reduce the storage requirements.

An additional advantage of the polynomial approximations is that the
difficulty $A(t) \to \infty$ as $t \to \bar{t}_f$ disappears. The polynomial
$\sum b_j t^j$ will certainly be well behaved in the neighborhood of
$t = \bar{t}_f$. Thus, the need for a transverse state comparison, so
important for neighboring extremal control, may become less significant in
analyses conducted along the present lines.


It is, of course, necessary to satisfy the constraint (46) in any (small
perturbation) analysis. It is not possible, however, to satisfy (47) for
arbitrary $\delta x(t_0)$ with the polynomial approximations. Hence, the use of
a statistical performance index is not only appropriate, but even
unavoidable. The alternative approach to the derivation of the previous
section is to consider minimizing the ensemble average of $\tfrac{1}{2} J_2$.
Constraints on the mean and variance of the $\psi$'s [equations (3) and (4)]
are imposed. Because these ensemble averages involve only the mean and
covariance of $\delta x(t)$, it is sufficient to use the differential equations
for $E(\delta x)$ and X in place of (46). Thus, (24) is fully equivalent to


= j2]  + di/Te{d^U(tp,tf]}  + ^|kje[d0jU(tf),tf3}2  + J + -^bt)e(6x) 

j=l  O 

. [at  -1  . bt(x  - ct>]  - £*(#*>]  * tr  l[(^  ♦ M)X  ♦ §*  bt)T  - x]}  <* 


(49) 
Here $\ell(t)$ and L(t) appear as a vector and matrix respectively of Lagrange
multiplier functions.* $\ell(t)$ is the vector adjoint to $E[\delta x(t)]$,
L(t) is the matrix adjoint to $\tilde{X}(t)$. All the necessary conditions of
the previous section may be obtained by requiring J of (49) to be stationary
with respect to arbitrary small changes in the unspecified parameters.


* Since L multiplies symmetric matrices in (49), it may be assumed symmetric
with no loss of generality.
Possible Additional Complexities


The analysis as presented allows disturbances only in the form of
perturbations in the initial state variables. It also assumes that the state
is known perfectly at all times. Both restrictions may be relaxed while still
retaining the polynomial approximation approach.

Disturbing  influences  may  arise  from  perturbations  of  system  parameters 
from  their  reference  values.  For  example,  the  thrust  and/or  fuel  consumption 
rate  of  a rocket  vehicle  may  deviate  from  its  pre-planned  value.  To  allow  for 
this  in  the  analysis  presented  here,  such  system  parameters  may  be  regarded 
as  state  variables  with  zero  time  derivatives.  Thus,  a parameter  deviation 
becomes  an  initial  state  variable  perturbation. 

Time-dependent random forcing functions may be added to the analysis if
their means and covariances are known, although serious complications may
arise if the noise is appreciably correlated in time. The main effect with
zero-mean white noise would be to add a term to $\dot{X}$. The other equations
would be unaltered, but any numerical solution might be substantially
different.

If state estimation errors were not considered negligible, it would be
possible to include them by considering the estimator characteristics. A
linear perturbation estimator would be consistent with the degree of
approximation used here. The estimator gain matrix would play a role
analogous to the feedback gain matrix. It would be approximated by a
polynomial analogous to $\sum b_j t^j$. The polynomial coefficients would be
added to the others, all to be chosen simultaneously to optimize the system
ensemble average performance.


Computational  Considerations 


The preceding analysis has been devoted to problem formulation and
development of first order necessary conditions for a minimum. Computational
determination of the control parameters which actually furnish a minimum
represents a second phase of study. It appears likely, however, that any of
the methods applicable to the solution of Mayer/Bolza variational problems
will be equally suitable to parameter optimization problems of the present
type. On the basis of experience, the writers are favorably inclined toward
the use of gradient methods (Refs. 7 and 8) and methods of the second
variation type (Ref. 9), and in this connection it should be noted that the
usual requirement for rapid access storage of control variables versus time
is eased in favor of a somewhat less severe requirement for storage of
parameter values. With the second order method of Ref. 9, it appears that
parameter optimization will entail the solution of fairly large linear
algebraic systems, and hence that greater attention than usual must be given
to error propagation problems.


Concluding  Remarks 

The present paper has sketched in some detail an ensemble averaging
approach to optimal guidance polynomial approximations. Conclusions on the
merits of this approach must be deferred until numerical examples of
synthesis procedure have been worked and system simulations performed. In
connection with the problem of guidance system mechanization, it will be of
interest to investigate the use of transverse state comparison or some similar
mode of comparison employing polynomial representation.
$s_1$, $s_2$, $s_3$   Scaling parameters

$\Delta\theta$   Transfer angle (true anomaly difference in transfer orbit plane)

$\mu$   Gravitation constant (95634.50100 mi³/sec²)

$\phi_1$   Angle from reference axis to departure position in initial orbit

$\phi_2$   Angle from reference axis to arrival position in terminal orbit

$\omega$   Argument of perigee, angle from reference axis to perigee point

Vectors

e   Orbit shape and orientation vector

g   Unit vector in gradient direction

I   Impulse vector

N   Unit vector denoting reference direction (line of intersection of initial and final orbit planes)

r   Geocentric satellite position vector

$U_1$   Unit vector directed toward point of departure from initial orbit

$U_2$   Unit vector directed toward point of arrival in final orbit

V   Velocity vector

W   Unit vector directed along orbit's angular momentum vector

Subscripts

1   Initial orbit

2   Final orbit

t   Transfer orbit

t1   Transfer orbit departure point

t2   Transfer orbit arrival point
I. INTRODUCTION

In previous papers,(1,2,3) the authors discussed the properties
of function spaces associated with optimum two-impulse transfer between
inclined elliptical orbits. An impulse function contouring technique which
presented the nature and structure of the entire function space was utilized
to identify all possible regions of a given function which would yield
optimum transfer orbits.

Contouring proves adequate for locating minima, and for providing
insight, but it generally does not provide required numerical accuracy. This
is true for many of the most interesting orbit pairs wherein the difficult
phase of numerical optimization occurs during the final convergence. These
particular functions are comprised of long, narrow "valleys" containing one
or more minima. It is therefore necessary to employ an alternate technique
to compute precise optimum orbital transfer circumstances for use in
engineering design studies.

Experience with ordinary steepest descent processes led to
numerous frustrations and amplified the need for the more powerful adaptive
steepest descent technique presented here. This rapid numerical method has
been applied successfully to the minimization of numerous different orbital
transfer function spaces. The method also has obvious application to a large
class of problems which require numerical determination of the extrema of
a function of 3 or more variables.


NORTH  AMERICAN  AVIATION,  INC. 


SPACE  and  INFORMATION  SYSTEMS  DIVISION 


II. TWO-IMPULSE ORBITAL TRANSFER FORMULATION

Adopting the notation of Ref. 1, consider a two-impulse transfer
process between an initial orbit with elements $p_1$, $e_1$, $\omega_1$, $i$
and a final orbit defined by $p_2$, $e_2$, $\omega_2$. The formulation assumes
Keplerian orbits and results from choosing the final orbit as the reference
plane; i is the relative inclination of the two orbit planes
($\cos i = \mathbf{W}_1 \cdot \mathbf{W}_2$, where $\mathbf{W}_1$ and
$\mathbf{W}_2$ are unit vectors directed along the angular momentum vectors of
the initial and final orbits). For coplanar orbits, the reference direction
(N) is arbitrary, but for inclined orbits N is defined as the line of
intersection of the two orbit planes
($\mathbf{N} = \mathbf{W}_2 \times \mathbf{W}_1 / |\mathbf{W}_2 \times \mathbf{W}_1|$).

For the general case, there is a three-parameter family of transfer
orbits joining any two specific orbits. The angles from the reference
line to departure point ($\phi_1$) and to arrival point ($\phi_2$) are a
natural choice for two of the three independent variables, since they, along
with the given orbital elements, specify position and velocity in the known
orbits (Fig. 1). The semilatus rectum ($p_t$) of the transfer orbit was the
third parameter used for this study. It was chosen since it simplified the
structure of the impulse function, $I(\phi_1, \phi_2, p_t)$.(5)

TRANSFER GEOMETRY

Unit vectors ($\mathbf{U}_1$ and $\mathbf{U}_2$) and radius vectors
($\mathbf{r}_1$ and $\mathbf{r}_2$) toward the departure and arrival points may
be computed from $\phi_1$, $\phi_2$ and the elements of the initial and final
orbits:*

$$ \mathbf{U}_1 = [\cos\phi_1,\ \sin\phi_1 \cos i,\ \sin\phi_1 \sin i] \tag{1} $$

$$ \mathbf{U}_2 = [\cos\phi_2,\ \sin\phi_2,\ 0] \tag{2} $$

$$ \mathbf{r}_j = \frac{p_j}{1 + e_j \cos(\phi_j - \omega_j)}\, \mathbf{U}_j, \qquad j = 1, 2 \tag{3} $$

*The subscripts 1, 2, and t denote initial, final and transfer orbits.

Unit vectors normal to the three orbit planes are defined as follows:

$$ \mathbf{W}_1 = [0,\ -\sin i,\ \cos i] \tag{4} $$

$$ \mathbf{W}_2 = [0,\ 0,\ 1] \tag{5} $$

$$ \mathbf{W}_t = \frac{\mathbf{U}_1 \times \mathbf{U}_2}{|\mathbf{U}_1 \times \mathbf{U}_2|}, \qquad \mathbf{U}_1 \times \mathbf{U}_2 \neq 0 \tag{6} $$

Two vectors that define the shape and orientation of the initial
and final orbits complete the transfer geometry:

$$ \mathbf{e}_j = e_j [\cos\omega_j,\ \sin\omega_j \cos i_j,\ \sin\omega_j \sin i_j], \qquad j = 1, 2 \tag{7} $$

The true anomaly interval traversed in the transfer orbit ($\Delta\theta$)
may be determined directly:

$$ \cos\Delta\theta = (\mathbf{U}_1 \cdot \mathbf{U}_2), \qquad 0^\circ < \Delta\theta < 180^\circ \tag{8} $$

No  generality  is  lost  if  the  true  anomaly  interval  is  limited 
to  the  first  two  quadrants.  Although  this  does  restrict  the  problem  to 
"short  transfers",  if  the  signs  of  the  velocity  vectors  in  the  transfer 
orbit  are  changed,  the  "long  transfers"  may  be  computed.  Thus,  in  order  to 
determine  the  absolute  minimum  impulse  transfer  between  two  elliptical  orbits, 
it  is  necessary  to  compare  the  minima  found  from  all  the  short  transfers  and 
all  the  long  transfers. 
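
The geometry of Eqs. (1), (2), (3) and (8) can be sketched as follows. This is a minimal illustration only: the function name `transfer_geometry`, the argument order, and the use of radians are assumptions made here and do not come from the report.

```python
import numpy as np

def transfer_geometry(phi1, phi2, inc, p1, e1, w1, p2, e2, w2):
    """Return U1, U2, r1, r2 and the short-transfer angle (all angles in radians)."""
    # Eq. (1): departure direction in the initial orbit, inclined by `inc`.
    U1 = np.array([np.cos(phi1),
                   np.sin(phi1) * np.cos(inc),
                   np.sin(phi1) * np.sin(inc)])
    # Eq. (2): arrival direction in the final orbit (the reference plane).
    U2 = np.array([np.cos(phi2), np.sin(phi2), 0.0])
    # Eq. (3): radius vectors from the conic equation.
    r1 = p1 / (1.0 + e1 * np.cos(phi1 - w1)) * U1
    r2 = p2 / (1.0 + e2 * np.cos(phi2 - w2)) * U2
    # Eq. (8): arccos confines the angle to the first two quadrants.
    dtheta = np.arccos(np.clip(U1 @ U2, -1.0, 1.0))
    return U1, U2, r1, r2, dtheta

# Hypothetical coplanar circular orbits of radii 1 and 2, a quarter turn apart.
U1, U2, r1, r2, dtheta = transfer_geometry(0.0, np.pi / 2, 0.0,
                                           1.0, 0.0, 0.0, 2.0, 0.0, 0.0)
```

Because `arccos` returns values between 0 and 180 degrees, the result always corresponds to the short transfer; the long transfer is handled by the sign change described above.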
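IMPULSE COMPUTATION

The function to be minimized is the total impulse for the two-impulse
maneuver:

$$ I = |\mathbf{I}_1| + |\mathbf{I}_2| \tag{9} $$

where

$$ \mathbf{I}_1 = \pm \mathbf{V}_{t1} - \mathbf{V}_1 \tag{10} $$

$$ \mathbf{I}_2 = \mathbf{V}_2 \mp \mathbf{V}_{t2} \tag{11} $$

(When a double sign is used, the upper sign refers to a "short transfer").

Velocity vectors in the initial and final orbits at the departure
and arrival points ($\mathbf{V}_1$ and $\mathbf{V}_2$) and the corresponding
velocity vectors in the transfer orbit ($\mathbf{V}_{t1}$ and
$\mathbf{V}_{t2}$) are computed as follows:

$$ \mathbf{V}_1 = \sqrt{\frac{\mu}{p_1}}\, \mathbf{W}_1 \times (\mathbf{e}_1 + \mathbf{U}_1) \tag{12} $$

$$ \mathbf{V}_2 = \sqrt{\frac{\mu}{p_2}}\, \mathbf{W}_2 \times (\mathbf{e}_2 + \mathbf{U}_2) \tag{13} $$

$$ \mathbf{V}_{t1} = \mathbf{v} + z \mathbf{U}_1 \tag{14} $$

$$ \mathbf{V}_{t2} = \mathbf{v} - z \mathbf{U}_2 \tag{15} $$

where,

$$ \mathbf{v} = \sqrt{\mu p_t}\, \frac{\mathbf{r}_2 - \mathbf{r}_1}{|\mathbf{r}_1 \times \mathbf{r}_2|} \tag{16} $$

$$ z = \sqrt{\frac{\mu}{p_t}}\, \tan(\Delta\theta / 2) \tag{17} $$

Eqs. 12 - 17 may be derived from Eq. 3.26 of Herget. The final impulse
equations are obtained from Eqs. 10 - 17 by substituting Eq. 6 and performing
several algebraic manipulations:

$$ \mathbf{I}_1 = \pm [\mathbf{v} + z \mathbf{U}_1] - \mathbf{V}_1 \tag{18} $$

$$ \mathbf{I}_2 = \mathbf{V}_2 \mp [\mathbf{v} - z \mathbf{U}_2] \tag{19} $$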

Impulses corresponding to long and short transfers are compared,
and the combination producing the lesser impulse is used for the remaining
computations. Because of the nature of the particular functions being
analyzed, regions neighboring each local minimum are usually comprised
entirely of either long or short transfers.
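
The impulse evaluation and the long/short comparison described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' program: the function name `total_impulse`, its argument order, the use of radians, and the reading of Eq. (7) with $i_1 = i$ and $i_2 = 0$ (final orbit in the reference plane) are assumptions made here. The default value of `mu` is the gravitation constant quoted in the nomenclature.

```python
import numpy as np

MU = 95634.501  # mi^3/sec^2, as quoted in the nomenclature

def total_impulse(phi1, phi2, pt, inc, p1, e1, w1, p2, e2, w2, mu=MU):
    """Return (I, kind) with I = |I1| + |I2| for the cheaper of short/long transfer."""
    c, s = np.cos, np.sin
    U1 = np.array([c(phi1), s(phi1) * c(inc), s(phi1) * s(inc)])      # Eq. (1)
    U2 = np.array([c(phi2), s(phi2), 0.0])                            # Eq. (2)
    W1 = np.array([0.0, -s(inc), c(inc)])                             # Eq. (4)
    W2 = np.array([0.0, 0.0, 1.0])                                    # Eq. (5)
    ev1 = e1 * np.array([c(w1), s(w1) * c(inc), s(w1) * s(inc)])      # Eq. (7), i_1 = i
    ev2 = e2 * np.array([c(w2), s(w2), 0.0])                          # Eq. (7), i_2 = 0
    r1 = p1 / (1.0 + e1 * c(phi1 - w1)) * U1                          # Eq. (3)
    r2 = p2 / (1.0 + e2 * c(phi2 - w2)) * U2
    V1 = np.sqrt(mu / p1) * np.cross(W1, ev1 + U1)                    # Eq. (12)
    V2 = np.sqrt(mu / p2) * np.cross(W2, ev2 + U2)                    # Eq. (13)
    dtheta = np.arccos(np.clip(U1 @ U2, -1.0, 1.0))                   # Eq. (8)
    v = np.sqrt(mu * pt) * (r2 - r1) / np.linalg.norm(np.cross(r1, r2))  # Eq. (16)
    z = np.sqrt(mu / pt) * np.tan(dtheta / 2.0)                       # Eq. (17)
    results = []
    for sign, kind in ((1.0, "short"), (-1.0, "long")):
        I1 = sign * (v + z * U1) - V1                                 # Eq. (18)
        I2 = V2 - sign * (v - z * U2)                                 # Eq. (19)
        results.append((np.linalg.norm(I1) + np.linalg.norm(I2), kind))
    return min(results, key=lambda item: item[0])

# Sanity check in normalized units (mu = 1): "transferring" along the same unit
# circular orbit with p_t = 1 should cost essentially nothing.
I, kind = total_impulse(0.0, np.pi / 2, 1.0, 0.0,
                        1.0, 0.0, 0.0, 1.0, 0.0, 0.0, mu=1.0)
```

The degenerate case $\mathbf{U}_1 \times \mathbf{U}_2 = 0$ (transfer angle of 0 or 180 degrees) is not guarded against in this sketch.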

IMPULSE  MINIMIZATION 

Minimization  of  Eq.  9 by  a steepest  descent  technique  requires 
computation  of  the  gradient.  Upon  differentiation,  Eq.  9 provides  the 
following  expressions: 


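$$ dI = \frac{\mathbf{I}_1 \cdot d\mathbf{I}_1}{|\mathbf{I}_1|} + \frac{\mathbf{I}_2 \cdot d\mathbf{I}_2}{|\mathbf{I}_2|} \tag{20} $$

or,

$$ \frac{\partial I}{\partial p_t} = \frac{\mathbf{I}_1 \cdot \partial \mathbf{I}_1 / \partial p_t}{|\mathbf{I}_1|} + \frac{\mathbf{I}_2 \cdot \partial \mathbf{I}_2 / \partial p_t}{|\mathbf{I}_2|} \tag{21} $$

$$ \frac{\partial I}{\partial \phi_1} = \frac{\mathbf{I}_1 \cdot \partial \mathbf{I}_1 / \partial \phi_1}{|\mathbf{I}_1|} + \frac{\mathbf{I}_2 \cdot \partial \mathbf{I}_2 / \partial \phi_1}{|\mathbf{I}_2|} \tag{22} $$

$$ \frac{\partial I}{\partial \phi_2} = \frac{\mathbf{I}_1 \cdot \partial \mathbf{I}_1 / \partial \phi_2}{|\mathbf{I}_1|} + \frac{\mathbf{I}_2 \cdot \partial \mathbf{I}_2 / \partial \phi_2}{|\mathbf{I}_2|} \tag{23} $$

The above expressions may be expanded as follows:

$$ \frac{\partial \mathbf{I}_1}{\partial p_t} = \pm \frac{\partial \mathbf{V}_{t1}}{\partial p_t} - \frac{\partial \mathbf{V}_1}{\partial p_t} \tag{24} $$

$$ \frac{\partial \mathbf{I}_2}{\partial p_t} = \mp \frac{\partial \mathbf{V}_{t2}}{\partial p_t} + \frac{\partial \mathbf{V}_2}{\partial p_t} \tag{25} $$

$$ \frac{\partial \mathbf{I}_1}{\partial \phi_1} = \pm \frac{\partial \mathbf{V}_{t1}}{\partial \phi_1} - \frac{\partial \mathbf{V}_1}{\partial \phi_1} \tag{26} $$

$$ \frac{\partial \mathbf{I}_1}{\partial \phi_2} = \pm \frac{\partial \mathbf{V}_{t1}}{\partial \phi_2} - \frac{\partial \mathbf{V}_1}{\partial \phi_2} \tag{27} $$

$$ \frac{\partial \mathbf{I}_2}{\partial \phi_1} = \mp \frac{\partial \mathbf{V}_{t2}}{\partial \phi_1} + \frac{\partial \mathbf{V}_2}{\partial \phi_1} \tag{28} $$

$$ \frac{\partial \mathbf{I}_2}{\partial \phi_2} = \mp \frac{\partial \mathbf{V}_{t2}}{\partial \phi_2} + \frac{\partial \mathbf{V}_2}{\partial \phi_2} \tag{29} $$

Noting that $\partial \mathbf{V}_1 / \partial p_t$ and
$\partial \mathbf{V}_2 / \partial p_t$ are each zero, a simplified expression
for $\partial I / \partial p_t$ may be obtained through several algebraic
manipulations:

$$ \frac{\partial I}{\partial p_t} = \pm \frac{1}{2 p_t} \left[ \frac{\mathbf{I}_1 \cdot (\mathbf{v} - z \mathbf{U}_1)}{|\mathbf{I}_1|} - \frac{\mathbf{I}_2 \cdot (\mathbf{v} + z \mathbf{U}_2)}{|\mathbf{I}_2|} \right] \tag{30} $$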


Additional  expressions  are  obtained  from  eqs.  26  - 29  by  direct  differentiation 

of  the  vector  equations. 

_ / 
a 0i  ^ i 


(It 

M2  r 

d esc  Ao  _ 

esc  a o a ri  1 

K 

L 

a0x 

rl  dh  \ 

pt 

) [ 

U,  a esc  A 

0 t esc  A 0 _c 

r2 

/ [ 

. a <>i 

, ( 

jAI 

anj 


- U-L  a cot  A 0 _ cot  A 0 d X; 


a 0 


-i  1 

Kf 


aii  _ 

a 0X 

altl  = 

717 


p -sin  0^^  cos  i,  -sin  01  sin  ij 

a esc  A o +-  esc  Ao  a -2  1 


(31) 

(32) 


a 0, 


esc  A 0 t-  pt  CSC  A 0 a r2 

a 


a 0S 


a cot  A o 

a 0O 


]} 


*h-  = 

tt2 


(33) 

(34) 
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*1 12 

d 01 


fT~  f-  pt  r U,  d CSC  A 0 + 

\ l'1  ifi — 

+ U„  I / 1 - pt  \ d esc  L 

-2Ll  Tx  ) 

+ d cot  A 0 1 1 

J; 


esc  A 0 d — - 

d 


±h  1 

d . 


A 0 - pt  esc  A 0 d rl 

d 


rl2 


dl2 

«2t2 

d 0o 


= 0 


(35) 

(36) 


4% 


p2 


1*  - d esc  A 0 +1  esc  A 0 

a r2l 

L d 02  r2 

a 02J 

U„  d esc  A 0 + esc  AO  <3—2 

d 02  a 02  . 

+ cot  A © d -2  \ 

(37) 

J 

-sin  02  , o] 

(38) 

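The terms in Eqs. 31 - 38 may be computed from the following expressions:

$$\frac{\partial \mathbf{U}_1}{\partial \phi_1} = [-\sin\phi_1,\ \cos\phi_1\cos i,\ \cos\phi_1\sin i] \tag{39}$$

$$\frac{\partial \mathbf{U}_2}{\partial \phi_2} = [-\sin\phi_2,\ \cos\phi_2,\ 0] \tag{40}$$

$$\frac{\partial r_1}{\partial \phi_1} = \frac{r_1^2 e_1 \sin(\phi_1 - \omega_1)}{p_1} \tag{41}$$

$$\frac{\partial r_2}{\partial \phi_2} = \frac{r_2^2 e_2 \sin(\phi_2 - \omega_2)}{p_2} \tag{42}$$

$$\frac{\partial \csc\Delta\phi}{\partial \phi_j} = -\csc\Delta\phi\,\cot\Delta\phi\,\frac{\partial \Delta\phi}{\partial \phi_j}, \qquad j = 1, 2 \tag{43}$$

$$\frac{\partial \cot\Delta\phi}{\partial \phi_j} = -\csc^2\Delta\phi\,\frac{\partial \Delta\phi}{\partial \phi_j}, \qquad j = 1, 2 \tag{44}$$

The following convenient expression for $\Delta\phi$ allows computation of
the remaining derivatives:

$$\Delta\phi = \cos^{-1}(\cos\phi_1\cos\phi_2 + \sin\phi_1\sin\phi_2\cos i) \tag{45}$$

$$\frac{\partial \Delta\phi}{\partial \phi_1} = \frac{-(-\sin\phi_1\cos\phi_2 + \cos\phi_1\sin\phi_2\cos i)}{\sqrt{1 - (\cos\phi_1\cos\phi_2 + \sin\phi_1\sin\phi_2\cos i)^2}} \tag{46}$$

$$\frac{\partial \Delta\phi}{\partial \phi_2} = \frac{-(-\cos\phi_1\sin\phi_2 + \sin\phi_1\cos\phi_2\cos i)}{\sqrt{1 - (\cos\phi_1\cos\phi_2 + \sin\phi_1\sin\phi_2\cos i)^2}} \tag{47}$$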
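As a numerical check on Eqs. 43 - 47, the following sketch evaluates the transfer angle from the direction cosines of the two position unit vectors and compares the analytic partial derivatives with central differences. It assumes the short-transfer branch (0 < transfer angle < 180 deg), for which the inverse cosine is single valued. The function names, the step size, and the choice of sample point are illustrative assumptions and do not come from the report.

```python
import math

def cos_delta_phi(phi1, phi2, inc):
    """Cosine of the transfer angle, U1 . U2, for orbits inclined by `inc`."""
    return (math.cos(phi1) * math.cos(phi2)
            + math.sin(phi1) * math.sin(phi2) * math.cos(inc))

def delta_phi(phi1, phi2, inc):
    """Transfer angle on the short branch, 0 < delta_phi < pi (Eq. 45)."""
    return math.acos(cos_delta_phi(phi1, phi2, inc))

def delta_phi_partials(phi1, phi2, inc):
    """Analytic d(delta_phi)/d(phi1) and d(delta_phi)/d(phi2) (Eqs. 46, 47)."""
    c = cos_delta_phi(phi1, phi2, inc)
    root = math.sqrt(1.0 - c * c)
    d1 = -(-math.sin(phi1) * math.cos(phi2)
           + math.cos(phi1) * math.sin(phi2) * math.cos(inc)) / root
    d2 = -(-math.cos(phi1) * math.sin(phi2)
           + math.sin(phi1) * math.cos(phi2) * math.cos(inc)) / root
    return d1, d2

def csc_cot_partials(phi1, phi2, inc):
    """Partials of csc and cot of delta_phi w.r.t. phi1 and phi2 (Eqs. 43, 44)."""
    dphi = delta_phi(phi1, phi2, inc)
    csc = 1.0 / math.sin(dphi)
    cot = math.cos(dphi) / math.sin(dphi)
    partials = delta_phi_partials(phi1, phi2, inc)
    d_csc = tuple(-csc * cot * d for d in partials)
    d_cot = tuple(-csc * csc * d for d in partials)
    return d_csc, d_cot

def central_difference(func, x, h=1.0e-6):
    """Two-sided finite-difference estimate of d(func)/dx (illustrative step size)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

# Sample point (an assumption for illustration): angles of the first optimum
# listed in Table 1, with the 5 deg relative inclination of that example.
PHI1 = math.radians(73.8152)
PHI2 = math.radians(187.5568)
INC = math.radians(5.0)
```

A finite-difference comparison of this kind is a cheap way to catch sign errors before the derivatives are used in a gradient search. The long-transfer branch would need the complementary angle and is not handled here.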
III. ADAPTIVE STEEPEST DESCENT

Since  setting  Eqs.  21,  22,  and  23  equal  to  zero  yields  no  general 
analytical  solution,  one  is  faced  with  numerically  solving  an  ordinary 
calculus  problem  requiring  the  minimization  of  a function  of  3 variables. 

Successful use of a numerical search which stepped in the negative
gradient direction was reported in Refs. 4, 7, 8, and 9. However, this
procedure proved to be inadequate for the more sensitive function spaces.
Attempts  to  employ  Newton-Raphson  methods  were  similarly  frustrated  by  the 
nature,  structure,  and  multiplicity  of  minima  of  typical  impulse  function 
spaces. 


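The present "adaptive steepest descent" procedure effectively
overcomes the convergence and accuracy limitations of the previous methods.
A numerical search employing Eqs. 1 - 47 is terminated when the following
necessary conditions for a local minimum have been achieved:

$$\left|\frac{\partial I}{\partial \phi_1}\right| < \epsilon, \qquad \left|\frac{\partial I}{\partial \phi_2}\right| < \epsilon, \qquad \left|\frac{\partial I}{\partial p_t}\right| < \epsilon, \qquad 0 < \epsilon \ll 1 \tag{48}$$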


During the n'th step of the search the gradient vector is computed
and the (n + 1)'st coordinate vector is determined as follows:

$$\begin{bmatrix}\phi_1\\ \phi_2\\ p_t\end{bmatrix}_{n+1} = \begin{bmatrix}\phi_1\\ \phi_2\\ p_t\end{bmatrix}_{n} - \frac{a}{s_j}\begin{bmatrix}s_1 & 0 & 0\\ 0 & s_2 & 0\\ 0 & 0 & s_3\end{bmatrix}\begin{bmatrix}g_1\\ g_2\\ g_3\end{bmatrix}, \qquad j = 1, 2, \text{ or } 3 \tag{49}$$

where $a$ is the current magnitude of the step size, the $s_j$ are variable
scaling parameters, and $g_1$, $g_2$, and $g_3$ are the components of a unit vector
in the gradient direction. Note also that the scaling matrix is normalized
relative to one of the scaling parameters. Eq. 49 is employed to construct a
sequence of points at which the impulse function, $I(\phi_1, \phi_2, p_t)$, is evaluated.
An additional constraint on the process requires that the sequence of
impulses $\{I(\phi_1, \phi_2, p_t)_n\}$ be monotone decreasing.

The control logic for the optimization process is rather simple:

(1) If the inequality

$$I(\phi_1, \phi_2, p_t)_{n+1} < I(\phi_1, \phi_2, p_t)_n \tag{50}$$

is not satisfied, $a$ is decreased and a new coordinate vector
$(\phi_1, \phi_2, p_t)_{n+1}$ is computed. Thus, the n'th stage of the
process is repeated until Eq. 50 is satisfied or $a < \epsilon$, $0 < \epsilon \ll 1$.

(2) Similarly, $a$ is decreased if Eq. 50 is satisfied during each
of a successive number of steps.

(3) The scaling parameters ($s_j$) are decreased each time a
corresponding component of the gradient vector changes sign.
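The three rules above can be sketched as a short routine. This is a minimal illustration under stated assumptions, not the authors' IBM 7094 program: the report gives neither the adjustment factors nor the length of the "successive number of steps", so the halving factors, the run length of three, the stopping thresholds, and the quadratic test function `valley` are all hypothetical. Rule (2) is implemented as printed through `success_factor`; a value above 1 would enlarge the step after a run of successes instead.

```python
import math

def adaptive_steepest_descent(f, grad, x0, a=1.0, eps=1.0e-8, a_min=1.0e-12,
                              fail_factor=0.5, success_factor=0.5,
                              success_run=3, scale_factor=0.5, max_steps=5000):
    """Scaled steepest descent following rules (1)-(3); all factors are assumptions."""
    x = list(x0)
    n = len(x)
    s = [1.0] * n                      # scaling parameters s_j
    fx = f(x)
    history = [fx]                     # accepted function values, one per step
    prev_g = None
    run = 0                            # count of consecutive accepted steps
    for _ in range(max_steps):
        g = grad(x)
        if max(abs(gj) for gj in g) < eps:     # stop test in the spirit of Eq. 48
            break
        # Rule (3): reduce s_j when component j of the gradient changes sign.
        if prev_g is not None:
            for j in range(n):
                if g[j] * prev_g[j] < 0.0:
                    s[j] *= scale_factor
        prev_g = g
        s_ref = max(s)                 # normalize relative to one scaling parameter
        s = [sj / s_ref for sj in s]
        g_norm = math.sqrt(sum(gj * gj for gj in g))
        unit = [gj / g_norm for gj in g]
        # Rule (1): repeat the stage with a smaller step until the value decreases.
        trial, f_trial = None, None
        while a >= a_min:
            cand = [x[j] - a * s[j] * unit[j] for j in range(n)]
            f_cand = f(cand)
            if f_cand < fx:
                trial, f_trial = cand, f_cand
                break
            a *= fail_factor
            run = 0
        if trial is None:              # step size exhausted without improvement
            break
        x, fx = trial, f_trial
        history.append(fx)
        # Rule (2): adjust a after a run of successful steps.
        run += 1
        if run >= success_run:
            a *= success_factor
            run = 0
    return x, fx, history

def valley(x):
    """Hypothetical test function with a long narrow valley."""
    return (x[0] - 1.0) ** 2 + 25.0 * (x[1] + 2.0) ** 2 + 0.1 * (x[2] - 3.0) ** 2

def valley_grad(x):
    return [2.0 * (x[0] - 1.0), 50.0 * (x[1] + 2.0), 0.2 * (x[2] - 3.0)]

x_end, f_end, history = adaptive_steepest_descent(valley, valley_grad, [5.0, 5.0, 5.0])
```

Because a step is accepted only when the function value drops, the recorded sequence is monotone decreasing by construction, as the text requires of the impulse sequence. How quickly the sketch converges depends strongly on the assumed factors.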

The  process  control  philosophy  is  clearly  an  unsophisticated  trial 
and  error  learning  procedure.  It  does,  however,  provide  a rapid  and  reliable 
method  of  handling  the  inevitable  scaling  problems  associated  with  steepest 
descent  or  gradient  methods.  While  it  is  true  that  more  exact  methods  for 
determining  the  scaling  matrix  are  available,  such  methods  involve  analytical 
or  numerical  evaluation  of  higher  derivatives^0^.  For  the  class  of  functions 
treated  here,  it  is  not  clear  that  this  additional  sophistication  is  worth 
the  cost  (analytical  and  programming)  of  implementation.  The  fact  that  only 
a few  seconds  of  IBM  7094  time  is  required  to  determine  a typical  local 
minimum is offered as further testimony to the practicality of this
simplified scaling procedure.

CONVERGENCE  PROPERTIES 

An impulse function space associated with a pair of inclined
elliptical orbits ($p_1$ = 5000 mi, $p_2$ = 6000 mi, $e_1 = e_2 = 0.2$,
$\omega_1 = -90°$, $\omega_2 = +30°$, $i = 5°$) was investigated with an IBM 7094 double
precision program incorporating the adaptive steepest descent technique. This
particular function space was previously studied in Ref. 1 by generating an
optimum impulse contour map (Fig. 1). The descent paths associated with a
number of starting points have been plotted in Fig. 1. This function space
offers no significant problems and the four minima predicted by contouring are
quickly established with required accuracy (13 significant figures) regardless
of the particular descent path. Table 1 contains the parameters associated
with each minimum as well as the computer time required for the shorter
descent paths.


Table 1 - Optimum Transfer Parameters

Initial Orbit:  p1 = 5000. mi   e1 = 0.2   w1 = -90. deg   i = 5. deg
Final Orbit:    p2 = 6000. mi   e2 = 0.2   w2 = +30. deg

Optimum   phi1, Deg.   phi2, Deg.   p_t, mi      Impulse, fps       7094 time, sec.
1         73.8152      187.5568     6644.8496    4902.65122 3852    3.4
2         40.8343      298.2634     6617.7904    5343.14869 3477    2.5
3         177.8114     73.6465      4611.8023    5393.78114 4757    3.0
4         308.2034     37.7403      4592.8574    5654.19120 9679    2.8
Figure 1 - Descent Paths Plotted on Optimum-Impulse Contour Map


NORTH  AMERICAN  AVIATION,  INC. 


SPACE  and  INFORMATION  SYSTEMS  DIVISION 


Figs. 2 and 3 illustrate typical behavior of the various control
and function parameters during optimization of a function having long narrow
features similar to those appearing in Fig. 4. In order to minimize impulse
in this example the program must search down a long narrow "tube" whose
principal axis extends approximately in the $\phi_1$ direction. The large initial
increase in the $\phi_1$ scale factor ($s_1$) allows a large $\phi_1$ correction to be
accomplished early in the optimization (Fig. 2). Near the minimum the "tube"
becomes more nearly "disc" shaped. The scaling parameters stabilize and
maintain their general "disc" shape as the number of steps exceeds 500.

Fig. 3 illustrates convergence of the coordinate vector's components
(solid lines). Note the large changes in $\phi_1$, corresponding to the maximum
values of $s_1$ appearing in Fig. 2. An additional case involving ordinary
steepest descent optimization (i.e., $s_1 = s_2 = s_3 = 1$) is presented for
comparison (broken lines). Under this constraint the descent process locates
the center of the "tube" and then begins a very slow movement in the $\phi_1$
direction. Impulse convergence for these two cases is also illustrated in
Fig. 2. Note that the adaptive method continues to minimize long after the
ordinary steepest descent process has essentially ceased optimization.

Although  the  convergence  of  the  adaptive  method  appears  to  be  quite 
slow,  it  should  be  remembered  that  this  particular  function  was  chosen  for 
its  difficulty.  Fig.  2 also  includes  data  for  an  optimization  involving  the 
inclined  elliptical  orbits  which  produced  Fig.  1.  For  this  optimization  the 
impulse  error  (In  - 1^  ) decreases  from  lcA  to  10”^  in  only  40  steps. 

Clearly,  the  method  quickly  adapts  to  the  structure  of  any  function  and  then 
proceeds  to  make  good  progress  toward  the  local  minimum  of  interest. 
IV. NUMERICAL RESULTS

"ALMOST TANGENT" ORBITS

Several authors have established the non-optimality of ordinary
co-tangential transfers between elliptical orbits and one-impulse transfers
at a point of tangency. However, one easily observes that
optimum transfer orbits usually are nearly tangent to both the initial and
final orbits. This fact and certain other questions generated during prior
studies by function contouring (Refs. 1, 2, 3) made the class of "almost tangent"
orbits an interesting candidate for further numerical investigation. The
existence of two locally optimum transfers between tangent orbits was
demonstrated in Ref. 1. Further investigation using the adaptive steepest
descent program has established the existence of at least three (3) local
minima in the function spaces associated with a large class of "almost
tangent" orbits.

In Fig. 4, two optimum impulse contour maps for a pair of tangent
orbits ($p_1$ = 5000 mi, $p_2$ = 6000 mi, $e_1 = e_2 = 0.2$, $\Delta\omega = -53.1301°$) are
presented in order to adequately display the long narrow "valleys" which are
characteristic of this class of function spaces. Note that the scales are
greatly distorted to amplify certain details and to allow the use of a small
contour interval ($\Delta I$ = 0.01 fps).

By constraining the numerical search to planes normal to the axes
of the various valleys one may develop a complete picture of the optimum
regions of a given function space. In Fig. 5 impulse is plotted as a function
of position throughout the space by first traversing the horizontal valley
($\phi_2 \approx 71°$) and then traversing the vertical valley ($\phi_1 \approx 71°$). In Fig.
5b a number of points (a - f) are plotted on the curve for tangent orbits. These
same points are reproduced in Fig. 4 to allow matching of the various
structural features with corresponding values of impulse.

[Figure 5, panel (c): Deeply Intersecting Orbits]

The curves appearing in Fig. 5b were obtained by rotating the tangent
orbits of Fig. 4 from a nonintersecting orientation ($\Delta\omega = -53.10°$) to a
slightly intersecting orientation ($\Delta\omega = -53.17°$). Several of these
curves exhibit three local minima. Although the impulse difference between
minima is slight and not usually important in the engineering sense, it is
necessary to isolate the absolute minimum for valid comparisons with finite
thrust maneuvers such as the Lawden Spiral.

Also appearing in Fig. 5b are essentially straight lines corresponding
to one-impulse transfer maneuvers performed at the intersection point
of smallest radius. Contensou and Breakwell have each demonstrated
the existence of such optimal one-impulse transfer maneuvers. The problem
of finding these one-impulse maneuvers is discussed in Refs. 17 and 18 which
develop formulae for predicting the range of orbit parameters for which the
one-impulse maneuver is optimum. Figs. 5a and 5c illustrate the effect of
large rotations from a tangency condition. Note that three local minima
persist in Fig. 5a although the orbits are far from intersecting. If
intersection deepens (Fig. 5c) the function space again begins to have small
regions denoting two-impulse maneuvers which require less impulse than the
associated one-impulse maneuver. Fig. 6 further clarifies this relationship
by plotting optimum impulse for both the one and two-impulse maneuvers. The
two curves are seen to coincide over a small range of relative orientation.

INCLINED ELLIPTICAL ORBITS

The existence of optimal coplanar orbital transfer maneuvers
requiring no more than two impulses is discussed by Contensou and
Breakwell. Extensive investigation using the adaptive steepest descent
program strongly suggests that optimal transfer between classes of inclined,
non-coapsidal, elliptical orbits also requires a maximum of two impulses.
It follows that optimal one-impulse maneuvers between inclined elliptical
orbits must also exist.

Figure 6 - Impulse Comparison for Optimum One and Two-Impulse Transfers

LAWDEN SPIRAL VS. TWO IMPULSE TRANSFER

In Ref. 19 Lawden discusses the possible optimality of a particular
intermediate thrust spiral trajectory. Using a contouring technique the
authors of this paper demonstrated the existence of optimum two-impulse
transfer maneuvers which require less total $\Delta V$ than the corresponding
Lawden spiral maneuvers (Refs. 20, 21). Using the adaptive steepest descent program,
these numerical results have now been expanded to give a broad comparison of
the two-impulse maneuver and the Lawden spiral.

The orbits which osculate to the Lawden spiral are generated by
varying the parameter $\sin^2\psi$ which denotes position on the spiral. In
Fig. 7 the difference in velocity change required for the two maneuvers
($\Delta V_{\text{Lawden}} - \Delta V_{\text{2-imp}}$) is plotted as a function of position difference
between the osculation points. A family of curves was generated by varying
$\sin^2\psi$ of the initial orbit.

In all cases computed a two-impulse maneuver which required less
$\Delta V$ than the Lawden spiral was found. Numerical accuracy limitations
prevented extending these comparisons to smaller values of $\Delta\sin^2\psi$.
Interestingly enough, all the curves presented indicate that the difference
in velocity change increases as the 4.7 power of $\Delta\sin^2\psi$, which leads to
a severe departure from the Lawden spiral $\Delta V$ as $\Delta\sin^2\psi$ increases.

Figure 7 - Lawden Spiral $\Delta V$ Compared to Optimum Two-Impulse $\Delta V$
V. CONCLUSION

An effective numerical method for precise computation of optimum
two-impulse transfers between inclined elliptical orbits has been developed
and verified. When supplemented by previously developed function mapping
techniques (Refs. 1 - 3), the adaptive steepest descent program has successfully
minimized the most difficult function spaces encountered. The complexity
of the more interesting function spaces suggests that considerable caution
should be exercised when numerically seeking the absolute minimum two-impulse
transfer.

In view of the demonstrated optimality of the two-impulse maneuver
for transferring between a large class of orbits, this proven numerical
optimization program becomes a valuable tool for use in numerous research
and engineering studies.
Numerical and analytical results concerning optimum one-impulse orbital transfer
maneuvers are presented. By considering a class of "shallowly intersecting" coplanar
orbits which may be produced by differentially changing the orbital elements of a pair
of tangent orbits, one may derive a number of approximate expressions concerning the
minima of the one-impulse maneuvers that occur. Numerical comparisons of one-impulse
transfers and corresponding optimum two-impulse and optimum 180° two-impulse transfers
were made. These comparisons suggested that there exists a narrow range of shapes over
which one-impulse transfer is optimal and indicated analytical expressions bounding the
region for the equivalence of one-impulse transfer and optimum 180 degree two-impulse
transfer. Simple exact equations which define outer bounds to the range of shapes
over which one-impulse may be optimal were then derived. Straightforward evaluation
of these expressions immediately establishes the non-optimality of a one-impulse transfer
between any given pair of coplanar orbits.
a         Semimajor axis
e         Eccentricity
f         True anomaly
epsilon   True anomaly half angle between intersection points
i         Inclination
J         Impulse
mu        Gravitation constant
p         Semilatus rectum
rho       sqrt(p2/p1)
phi       Angle denoting true anomaly of line which bisects angle
          between intersection points
omega     Argument of perigee (initial orbit relative to final orbit)

Vectors

P         Unit vector in perigee direction

Subscripts
I. INTRODUCTION

During the course of a continuing study of optimum orbital
transfer maneuvers (Refs. 1, 2, 3, 4, 5) the class of "shallowly intersecting"
orbit pairs was shown to be worthy of further study. For such orbits,
numerical data indicated the existence of one-impulse orbital transfer
maneuvers which resulted in minimum fuel expenditure, a result which has
been discussed by Contensou and Breakwell. If one must find the
optimum transfer between a pair of non-coapsidal, "shallowly intersecting,"
coplanar elliptical orbits, it is clearly desirable to determine if a
one-impulse maneuver is optimal before proceeding with two-impulse
optimization techniques such as those described in Refs. 4 and 5. Furthermore,
it is known that the impulse function spaces associated with such orbit pairs
offer a formidable and time consuming challenge to numerical optimization
techniques. This is largely because these function spaces are structured
in the form of long narrow "valleys" containing several minima (Ref. 5).
Therefore, a strong motivation exists for developing formulae for predicting
and evaluating these favorable orbital transfer maneuvers.
II. GEOMETRY OF SHALLOW INTERSECTIONS

Consider two coplanar, non-coapsidal, elliptical orbits that are
nearly tangent and are described by the elements $p_1$, $p_2 = \rho^2 p_1$, $e_1$, $e_2$,
and $\omega$, where $\rho^2 = p_2/p_1 > 1$, $e_1 \neq 0$, $e_2 \neq 0$, and $\omega \neq 0$. For
$p_1 = p_2$ the orbit intersections must lie 180 degrees apart and
this case is therefore excluded because a shallow intersection is to be
characterized by a small true anomaly interval between the two points of
intersection. Finally, one may restrict $\omega$ to the range $0 < \omega < 180°$
without loss of generality since only the angular difference between the
perigee vectors ($\mathbf{P}_1$ and $\mathbf{P}_2$) is required.

The geometry of the shallow intersection is shown in Fig. 1. Let
the line FB lie at the angle $\phi$ from $\mathbf{P}_1$ and let it also bisect the angle
between the two intersections ($2\epsilon$). If $\epsilon$ is small, and $0 < 2\epsilon < 180°$:

$$\sin\phi = e_2 \sin\omega / D \tag{1}$$

$$\cos\phi = (e_2\cos\omega - \rho^2 e_1)/D \tag{2}$$

$$\cos\epsilon = (\rho^2 - 1)/D \tag{3}$$

where,

$$D^2 = \rho^4 e_1^2 + e_2^2 - 2\rho^2 e_1 e_2 \cos\omega \tag{4}$$

Figure 1 - The Geometry of Shallow Intersections

The true anomalies of the intersection points of smallest and largest radius
are $\phi - \epsilon$ and $\phi + \epsilon$ respectively. Let the subscript T denote tangent
orbits and assume that the elements $\rho$, $e_1$, $e_2$, and $\omega$ differ by small amounts
($\delta\rho$, $\delta e_1$, $\delta e_2$, $\delta\omega$) from their values at tangency, and furthermore,
assume that these perturbed orbits intersect. Since the tangent condition
requires that:

$$D_T^2 = (\rho_T^2 - 1)^2 = \rho_T^4 e_{1T}^2 - 2\rho_T^2 e_{1T} e_{2T}\cos\omega_T + e_{2T}^2 \tag{5}$$

for $\epsilon \ll 1$ one may write

$$1 - \cos\epsilon \approx \frac{\epsilon^2}{2} \approx -\sum_{j=1}^{4} \left(\frac{\partial \cos\epsilon}{\partial \alpha_j}\right)_T \delta\alpha_j \tag{6}$$

where the $\alpha_j$ are the four elements: $\rho$, $e_1$, $e_2$, and $\omega$.
Clearly, Eq. 5 may be used to find the value of an element that will yield
tangency if the other elements are given.

Thus,

$$\epsilon \approx \left[-2\sum_{j=1}^{4} \left(\frac{\partial \cos\epsilon}{\partial \alpha_j}\right)_T \delta\alpha_j\right]^{1/2} \tag{7}$$

Although shallow intersection may be generated by differentially changing
any of the four parameters, only changes in $\omega$ will be considered for
brevity in the numerical comparisons. For small changes, $\delta\omega$, one may
assume $\omega_T \approx \omega$ and,

$$D^2 - D_T^2 = 2\rho^2 e_1 e_2 [\cos\omega_T - \cos(\omega_T + \delta\omega)] \tag{8}$$

$$D^2 - D_T^2 \approx 2\rho^2 e_1 e_2 \sin\omega\,\delta\omega \tag{9}$$

and

$$\epsilon \approx \frac{\rho}{\rho^2 - 1}\sqrt{2 e_1 e_2 \sin\omega\,\delta\omega} \tag{10}$$
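The geometry above lends itself to a quick numerical illustration. The sketch below solves Eq. 5 for the perigee separation that gives tangency, evaluates Eqs. 1 - 4 for a slightly rotated pair, and compares the exact half angle with the small-angle estimate of Eq. 10. The function names and the 0.001 rad rotation are assumptions chosen for illustration; the orbit elements are the three pairs quoted later in the comparison section.

```python
import math

def tangency_angle(rho2, e1, e2):
    """Perigee separation omega_T (radians) satisfying Eq. 5, i.e. D_T = rho^2 - 1.

    rho2 is rho squared, the ratio p2/p1.
    """
    cos_w = ((rho2 * e1) ** 2 + e2 ** 2 - (rho2 - 1.0) ** 2) / (2.0 * rho2 * e1 * e2)
    return math.acos(cos_w)

def intersection_geometry(rho2, e1, e2, omega):
    """Return (phi, eps, D) from Eqs. 1 - 4; eps is None if the orbits do not intersect."""
    D = math.sqrt((rho2 * e1) ** 2 + e2 ** 2 - 2.0 * rho2 * e1 * e2 * math.cos(omega))
    phi = math.atan2(e2 * math.sin(omega), e2 * math.cos(omega) - rho2 * e1)
    cos_eps = (rho2 - 1.0) / D
    eps = math.acos(cos_eps) if cos_eps <= 1.0 else None
    return phi, eps, D

def eps_small_angle(rho2, e1, e2, omega, d_omega):
    """Small-angle estimate of the half angle between intersections (Eq. 10)."""
    rho = math.sqrt(rho2)
    return rho / (rho2 - 1.0) * math.sqrt(2.0 * e1 * e2 * math.sin(omega) * d_omega)

# Tangency angles, in degrees, for the three orbit pairs quoted in the text.
CASES = [(1.2, 0.2, 0.2), (1.8, 0.2, 0.6), (2.25, 0.6, 0.95)]
TANGENCY_DEG = [math.degrees(tangency_angle(*case)) for case in CASES]

# Rotate the first pair 0.001 rad past tangency (illustrative value).
OMEGA_T = tangency_angle(1.2, 0.2, 0.2)
D_OMEGA = 1.0e-3
PHI, EPS, D = intersection_geometry(1.2, 0.2, 0.2, OMEGA_T + D_OMEGA)
EPS_APPROX = eps_small_angle(1.2, 0.2, 0.2, OMEGA_T, D_OMEGA)
```

Solving Eq. 5 for the angle rather than for another element is only one choice; the same relation can be rearranged for an eccentricity or for the semilatus rectum ratio when those are the free quantities.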
III. OPTIMAL ONE-IMPULSE TRANSFER

One and two-impulse transfers between coplanar orbits have been
investigated by Ting, Horner, and Barrar. Although these authors
did not specifically consider nearly tangent orbits, Barrar does mention the
possibility of optimizing one-impulse transfer on orbit orientation.
It is convenient to adopt the notation of Ref. 11, and to express velocities
and impulses in units of $\sqrt{\mu/p_1}$. Velocity vectors ($\mathbf{v}_1$ and $\mathbf{v}_2$) in the
initial and final orbits may then be defined as follows:

$$\sqrt{p_1/\mu}\;\mathbf{v}_1 = \mathbf{V} + e_1\mathbf{Q}_1 \tag{11}$$

$$\sqrt{p_2/\mu}\;\mathbf{v}_2 = \mathbf{V} + e_2\mathbf{Q}_2 \tag{12}$$

where $\mathbf{V}$ is a unit vector perpendicular to the radius at the transfer point
and $\mathbf{Q}_1$ and $\mathbf{Q}_2$ are unit vectors perpendicular to the perigee vectors (see
Fig. 1). The impulse for the one-impulse transfer maneuver is expressed as
follows:

$$\mathbf{J} = \frac{\mathbf{V}(1 - \rho) + \mathbf{C}}{\rho} \tag{13}$$

where

$$\mathbf{C} = e_2\mathbf{Q}_2 - \rho e_1\mathbf{Q}_1 \tag{14}$$

$$J^2\rho^2 = C^2 + (1 - \rho)^2 + 2(1 - \rho)\,\mathbf{C}\cdot\mathbf{V} \tag{15}$$

$$C^2 = e_2^2 + \rho^2 e_1^2 - 2\rho e_1 e_2\cos\omega \tag{16}$$

and

$$\mathbf{C}\cdot\mathbf{V} = e_2\cos(\phi - \epsilon - \omega) - \rho e_1\cos(\phi - \epsilon) \tag{17}$$

where the angles ($\phi - \epsilon$) and ($\phi - \epsilon - \omega$) are the true anomalies of
the transfer point on the first and second orbits respectively. By using
Eqs. 1 and 2 the angle $\phi$ is eliminated and Eq. 17 becomes:

$$\mathbf{C}\cdot\mathbf{V} = \cos\epsilon\,\frac{E^2}{D} + \sin\epsilon\,\frac{(\rho - 1)\rho e_1 e_2\sin\omega}{D} \tag{18}$$

where

$$E^2 = \rho^3 e_1^2 - (1 + \rho)\rho e_1 e_2\cos\omega + e_2^2 \tag{19}$$

Finally,

$$J^2\rho^2 = C^2 + (\rho - 1)^2 - 2(\rho - 1)\left[\frac{E^2}{D}\cos\epsilon + \frac{(\rho - 1)\rho e_1 e_2\sin\omega}{D}\sin\epsilon\right] \tag{20}$$

One can now compare the impulse for the tangent condition ($J_T$) with the
impulses for the two points of intersection ($J_1$ and $J_2$) by changing the sign
of $\epsilon$ in Eq. 20. Clearly, $\epsilon$ must be small as must $\delta\rho$, $\delta e_1$, $\delta e_2$, and
$\delta\omega$.
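Eq. 20 can be checked against a direct evaluation of the velocity difference at an intersection point. The sketch below does this in a planar frame whose x axis lies along the first perigee vector, an assumption made only to give the unit vectors explicit components. The function names, the `eps_sign` argument, and the 0.002 rad rotation past tangency are likewise illustrative and not taken from the report.

```python
import math

def geometry(rho, e1, e2, omega):
    """Return (D, phi, eps) from Eqs. 1 - 4 for intersecting orbits."""
    rho2 = rho * rho
    D = math.sqrt((rho2 * e1) ** 2 + e2 ** 2 - 2.0 * rho2 * e1 * e2 * math.cos(omega))
    phi = math.atan2(e2 * math.sin(omega), e2 * math.cos(omega) - rho2 * e1)
    eps = math.acos(min(1.0, (rho2 - 1.0) / D))   # clamp guards round-off at tangency
    return D, phi, eps

def impulse_eq20(rho, e1, e2, omega, eps_sign=1.0):
    """One-impulse J from Eq. 20, in units of sqrt(mu/p1).

    eps_sign=+1 gives the point at phi - eps; eps_sign=-1 changes the sign of
    eps in Eq. 20 and gives the point at phi + eps.
    """
    D, _, eps = geometry(rho, e1, e2, omega)
    eps *= eps_sign
    E2 = rho ** 3 * e1 ** 2 - (1.0 + rho) * rho * e1 * e2 * math.cos(omega) + e2 ** 2
    C2 = e2 ** 2 + rho ** 2 * e1 ** 2 - 2.0 * rho * e1 * e2 * math.cos(omega)
    bracket = (E2 * math.cos(eps)
               + (rho - 1.0) * rho * e1 * e2 * math.sin(omega) * math.sin(eps)) / D
    return math.sqrt(C2 + (rho - 1.0) ** 2 - 2.0 * (rho - 1.0) * bracket) / rho

def impulse_direct(rho, e1, e2, omega, eps_sign=1.0):
    """Same impulse from the velocity vectors of Eqs. 11 and 12."""
    _, phi, eps = geometry(rho, e1, e2, omega)
    f = phi - eps_sign * eps                      # true anomaly on the first orbit
    V = (-math.sin(f), math.cos(f))               # unit vector normal to the radius
    Q1 = (0.0, 1.0)                               # normal to the first perigee vector
    Q2 = (-math.sin(omega), math.cos(omega))      # normal to the second perigee vector
    v1 = (V[0] + e1 * Q1[0], V[1] + e1 * Q1[1])
    v2 = ((V[0] + e2 * Q2[0]) / rho, (V[1] + e2 * Q2[1]) / rho)
    return math.hypot(v2[0] - v1[0], v2[1] - v1[1])

RHO = math.sqrt(1.2)
OMEGA = math.acos(0.6) + 2.0e-3                   # slightly past tangency (illustrative)
J_MINUS = impulse_eq20(RHO, 0.2, 0.2, OMEGA, +1.0)   # transfer at phi - eps
J_PLUS = impulse_eq20(RHO, 0.2, 0.2, OMEGA, -1.0)    # transfer at phi + eps
```

Agreement between the two routes confirms that Eqs. 18 and 19 follow from Eq. 17 once Eqs. 1 and 2 are substituted. For these sample elements the impulse at the point phi - eps comes out the smaller of the two.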


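Noting that

$$J^2 - J_T^2 = (J - J_T)(J + J_T) \approx 2J_T(J - J_T) \tag{21}$$

it follows that

$$J - J_T \approx \frac{1}{2J_T}\left[\left(\frac{C^2}{\rho^2} - \frac{C_T^2}{\rho_T^2}\right) + \left(\frac{(\rho - 1)^2}{\rho^2} - \frac{(\rho_T - 1)^2}{\rho_T^2}\right) - 2\left(\frac{(\rho - 1)E^2}{\rho^2 D}\cos\epsilon - \frac{(\rho_T - 1)E_T^2}{\rho_T^2 D_T}\right) - \frac{2(\rho - 1)^2 e_1 e_2\sin\omega}{\rho D}\sin\epsilon\right] \tag{22}$$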
If each of the paired terms in Eq. 22 is expressed as a Taylor
series about the tangent condition the leading coefficients will involve
$\delta\rho$, $\delta e_1$, $\delta e_2$, or $\delta\omega$. However, the term involving $\epsilon$ has in its
leading coefficient $\sqrt{\delta\rho}$, $\sqrt{\delta e_1}$, $\sqrt{\delta e_2}$, or $\sqrt{\delta\omega}$ (see Eq. 7).
Therefore, as long as it does not have a zero coefficient the latter will
dominate the expression as small changes are introduced. Since $\epsilon$ can be
positive or negative it follows that for one intersection the impulse to
transfer is at first less than that required at tangency, and for the other
intersection it is greater. The intersection corresponding to $-\epsilon$ is
the one with the smaller impulse and smaller radius, a result pointed out by
Anthony and Sasaki (Ref. 12).

If $\rho$, $e_1$ and $e_2$ are fixed and only $\omega$ is varied, Eq. 22 yields:

$$J_1 - J_T \approx \frac{e_1 e_2\sin\omega}{J_T}\left[\frac{2E^2\,\delta\omega}{(\rho + 1)D^2} - \frac{(\rho - 1)^2\,\epsilon}{\rho D}\right] \tag{23}$$

Removing $\epsilon$ by using Eq. 10 gives (for $\epsilon$ positive):

$$J_1 - J_T \approx \frac{e_1 e_2\sin\omega}{J_T(\rho + 1)}\left[\frac{2E^2\,\delta\omega}{D^2} - \frac{\sqrt{2e_1 e_2\sin\omega\,\delta\omega}}{\rho + 1}\right] \tag{24}$$

The terms neglected in Eq. 24 begin with $\delta\omega^{3/2}$, and $\delta\omega$ has to be positive
in the direction which yields the pair of shallow intersections. Since the
sign of the coefficient of $\delta\omega$ is positive Eq. 24 has a minimum which is
given by Eq. 25.

$$(\delta\omega)_m = \frac{e_1 e_2\sin\omega}{8(\rho + 1)^2 E^4/D^4} \tag{25}$$
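The location of the minimum given by Eq. 25 can be compared with the unapproximated impulse of Eq. 20. The sketch below evaluates the predicted rotation for the orbit pair with p2/p1 = 1.2 and both eccentricities equal to 0.2, then samples Eq. 20 at the tangency angle, at the predicted minimum, and at two neighboring rotations. The function names and the choice of sample rotations (one tenth and four times the predicted value) are assumptions for illustration.

```python
import math

def one_impulse(rho, e1, e2, omega):
    """J at the intersection phi - eps from Eq. 20, in units of sqrt(mu/p1)."""
    rho2 = rho * rho
    D = math.sqrt((rho2 * e1) ** 2 + e2 ** 2 - 2.0 * rho2 * e1 * e2 * math.cos(omega))
    cos_eps = min(1.0, (rho2 - 1.0) / D)          # clamp guards round-off at tangency
    sin_eps = math.sqrt(1.0 - cos_eps * cos_eps)
    E2 = rho ** 3 * e1 ** 2 - (1.0 + rho) * rho * e1 * e2 * math.cos(omega) + e2 ** 2
    C2 = e2 ** 2 + rho ** 2 * e1 ** 2 - 2.0 * rho * e1 * e2 * math.cos(omega)
    bracket = (E2 * cos_eps
               + (rho - 1.0) * rho * e1 * e2 * math.sin(omega) * sin_eps) / D
    return math.sqrt(C2 + (rho - 1.0) ** 2 - 2.0 * (rho - 1.0) * bracket) / rho

def tangency_angle(rho, e1, e2):
    """Perigee separation at tangency from Eq. 5."""
    rho2 = rho * rho
    return math.acos(((rho2 * e1) ** 2 + e2 ** 2 - (rho2 - 1.0) ** 2)
                     / (2.0 * rho2 * e1 * e2))

def delta_omega_min(rho, e1, e2, omega_t):
    """Predicted rotation past tangency that minimizes the impulse (Eq. 25)."""
    D = rho * rho - 1.0                           # D at tangency
    E2 = rho ** 3 * e1 ** 2 - (1.0 + rho) * rho * e1 * e2 * math.cos(omega_t) + e2 ** 2
    return e1 * e2 * math.sin(omega_t) / (8.0 * (rho + 1.0) ** 2 * (E2 / D ** 2) ** 2)

RHO, E1, E2_ECC = math.sqrt(1.2), 0.2, 0.2
OMEGA_T = tangency_angle(RHO, E1, E2_ECC)
DW_M = delta_omega_min(RHO, E1, E2_ECC, OMEGA_T)  # radians

J_T = one_impulse(RHO, E1, E2_ECC, OMEGA_T)
J_AT_MIN = one_impulse(RHO, E1, E2_ECC, OMEGA_T + DW_M)
J_SMALLER = one_impulse(RHO, E1, E2_ECC, OMEGA_T + 0.1 * DW_M)
J_LARGER = one_impulse(RHO, E1, E2_ECC, OMEGA_T + 4.0 * DW_M)
```

For this pair the predicted rotation is roughly a thousandth of a radian past tangency, and the exact impulse there lies below both the tangency value and the two neighboring samples. A finer scan of `one_impulse` over the rotation would locate the true minimum if a tighter comparison is wanted.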

The corresponding values of $(\epsilon)_m$ and impulse change relative to tangency are:

$$(\epsilon)_m = \frac{\rho e_1 e_2\sin\omega}{(\rho^2 - 1)\,2(\rho + 1)E^2/D^2} \tag{26}$$

and,

$$(J_T - J_1)_m = \frac{e_1^2 e_2^2\sin^2\omega}{4J_T(\rho + 1)^3 E^2/D^2} \tag{27}$$

IV. OPTIMIZED 180 DEGREE TWO-IMPULSE TRANSFER

Determining an optimum two-impulse transfer is in general a three
parameter problem wherein even the conditions for optimum transfer between
coplanar orbits yield extremely unwieldy expressions. By contrast, finding
optimum 180 deg. two-impulse transfer is a two parameter problem and
optimization of one of the parameters is easily accomplished. In addition,
numerical comparisons (Ref. 13) indicate that optimum 180 deg. transfers closely
approximate the optimum two-impulse transfer in many cases. For these
reasons, and because simplified expressions are available for use in later
derivations, certain equations for 180° transfer are presented here.

Considering optimization of the transfer orbit parameter, the
departure point being fixed but arbitrary, the impulse is given by: (with
where,  f = true  anomaly  of  the  departure  point  on  the  first  orbit, 

C32  = V I1  + ei  cos  f + [l  - e2  cos  (f  - w (33) 

$J$ = total impulse required in units of $\sqrt{\mu/p_1}$,

and, $p_t$ = semilatus rectum of transfer orbit.
V. COMPARISON OF ONE AND TWO-IMPULSE TRANSFERS FOR "SHALLOWLY INTERSECTING"
ORBITS

Numerical results were obtained by first determining the intersection
of shortest radius and then searching for the optimum 180 deg. two-impulse
transfer by varying the departure point. A search was initiated by
determining what is called a practically optimum transfer in Ref. 13. Numerical
investigations of numerous orbit pairs all yielded similar results. For
purpose of illustration three orbit pairs with very different values of
eccentricity are presented here: 1) $\rho^2 = 1.2$, $e_1 = e_2 = 0.2$,
$\omega_T = \cos^{-1} 0.6 = 53.1301°$; 2) $\rho^2 = 1.8$, $e_1 = 0.2$, $e_2 = 0.6$,
$\omega_T = 110.3741°$; and 3) $\rho^2 = 2.25$, $e_1 = 0.6$, $e_2 = 0.95$, $\omega_T = 63.0498°$.

One-impulse and optimum 180 deg. two-impulse transfer data are
shown in Figs. 2, 3, and 4. For this example the intersection producing
element variation was obtained by rotating the final orbit relative to the
initial orbit. The two-impulse curves are seen to coincide with the one-impulse
curves near the minimum, the differences between the two being in the computer
noise (8 decimal places) over a small but finite range of relative orientation.
A few points on the two-impulse curve were investigated by a fully optimized
double precision two-impulse program (Ref. 5) and these points are indicated
by the black dots in Fig. 2. For these illustrations no significant
difference between optimum two-impulse transfers and optimum 180 deg.
transfer is apparent.

An intersection-producing change in shape can also be generated by
varying $e_1$ (or $e_2$) or $\rho$. Computer studies of such cases yielded curves
similar to those of Figures 2, 3, and 4. In every case the one-impulse
transfer experiences a minimum near tangency, and in every case the
two-impulse curve coincides with this minimum as it does for the cases
presented.

Figure 2 - Impulse for One and Two-Impulse Transfers versus $\omega$ (abscissa: angle between perigees, $\omega$)

Table 1 summarizes the results of using the approximate formulae
to predict values of $(\delta\omega)_m$, $(\epsilon)_m$, and $(J_T - J_1)_m$ for the three cases
illustrated. The values indicated "(pred.)" were obtained from Eqs. 25,
26, and 27 while the values labeled "(comp.)" were obtained by a one-impulse
computer program. The predicted values show good agreement with the actual
values, even for the highly eccentric orbits.


TABLE 1. PARAMETERS DESCRIBING OPTIMAL ONE AND TWO-IMPULSE TRANSFERS NEAR TANGENCY

In ft/sec for $p_1$ = 10,800 miles
VI. THE LIMITS FOR EQUIVALENCE OF ONE-IMPULSE AND OPTIMUM 180 DEG.
TWO-IMPULSE TRANSFER

Figs. 5, 6, and 7 present a sequence of curves for optimum 180 deg.
two-impulse transfer in the region where the two curves are identical. The
first pair of coplanar elliptical orbits ($\rho^2 = 1.2$, $e_1 = 0.2$,
$e_2 = 0.2$) is involved. The single impulse transfer is always at the
point of discontinuity and it is to be noted that the curves include this
particular transfer whether or not it happens to be optimum. Each graph
consists of two curves: one for which the departure point is near the
intersection point and the first impulse is large (labeled "+" and referring
to the positive scale of departure points) and one for which the arrival
point is near the intersection and the second impulse is large (labeled "-"
and referring to the negative scale of departure points). It is thus seen
that the range of values of $\omega$ over which the best 180° two-impulse transfer
reduces to the single impulse at intersection can be indicated by requiring
the proper curve to exhibit a horizontal tangent as the intersection is approached
from the proper side. (Note the scale changes which were required to plot
the various small differences.) Nearly similar sets of curves were obtained
for other pairs of orbits but are not shown. Of course, one may also cause
the set of shapes to be given by a range of values of $e_1$ or $\rho$ instead of
$\omega$. Again a similar set of curves would be obtained, indicating a range of
values of the variable over which one-impulse transfer at intersection is
identical with the best two-impulse transfer (180° two-impulse transfer).

For all curves on the left of the point of intersection the small
impulse is in the forward direction, since the angular momentum
(proportional to P^1/2) lies between that of the initial and final orbits,
while for all curves on the right of the intersection point the small
impulse opposes the direction of motion. In addition, for every case the
"+" curve is above the "-" curve on the left and below on the right. The
following conditions, therefore, bound the optimum one-impulse transfer
region:

(a) upper limit: the "+" curve has a horizontal tangent on the right, or

= 0,  with Co = p + (34)

= fl +

(b) lower limit: the "-" curve has a horizontal tangent on the left, or

= 0,  with C, = 1. + (35)

= (180° + f x) -

To express the limit conditions determined in Section V in a more
manageable form, it is necessary to evaluate the slope (dC/df) and use the
proper sign. In addition, restrict f to f = θ - ε in all equations and
in the frequently used expression

    P1/r = 1 + e1 cos f = [1 + e2 cos (f - ω)] / P        (36)




The  equations  become 

(e^  sin  f - e2  sin  (f-  <t>)  cos  f + P + 1 j 
+ (l  '*■  cos  f)  ( p - 1)  e-^  sin  f = 


1) 

2) 


D sin  < (1  + en  cos  f)  (upper  limit)  (37) 

2 P 1 

D sin  € (l  + cos  f)  (2 p - 2 + e^  cos  f)  (lower  limit) 

2P2 

(38) 


Replacing f by θ - ε, one obtains equations involving sin ε and cos ε up
to the third power. By substituting cos ε = (P - 1)/D and
sin² ε = 1 - cos² ε, one obtains equations which are linear in sin ε and
may be solved easily.

sine  = ej  e2  sinfaJ  2(  />2  - l)  ( P - l)  -D2  + ( P2  - l)2  (upper  limit) 

D G2 (3?L 

5J*  j*?  P*  - 1)  ( p - 1)  - 

[D2  - (p2  - l)2][p-  2ex(p2  - l)cos0/pD2]j 

|g2  - 1_  [d 2 ( p - 2 + ex2  sin2  0 
+ D e^  ( p 2 - l)  ( p - l)  cos  0 

+ e-j2  (P2  - l)2  cos  2 0]^-  (lower  limit)  (40) 

where, 

°2  = 3 p2  e-j2  - ( p2  + 2p  + 3)  e^  e 2 cosui  + (1  + 2 ) e,,2 

A simple iteration scheme was programmed to solve Eqs. 39 and 40. When
checked by using a double precision program, the limiting values for ω
corresponded to the expected horizontal tangents in the graphs of impulse
versus departure point.

In this development the variations in shape were made to occur as
a result of variation in relative perigee angle. The results (Eqs. 39 and
40), however, are quite general. Thus, in order to determine whether or not
one-impulse transfer may be the optimum for any given pair of intersecting
orbits, one would evaluate ε from cos ε = (P - 1)/D and then determine
whether or not sin ε lies in the range sin ε_l to sin ε_u. If so,
one-impulse transfer at the intersection with the smaller radius may be the
optimum impulse transfer between the two orbits.
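
The first step of this test, locating the intersection through
cos ε = (P - 1)/D, can be sketched as follows. The definitions of D and θ
used here are an assumption: they are obtained by writing the intersection
condition of Eq. 36 as P(1 + e1 cos f) = 1 + e2 cos(f - ω) and collecting
terms into D cos(f - θ) = P - 1; they are not quoted from the report. The
limits sin ε_l and sin ε_u (Eqs. 39 and 40) are taken as inputs, and the
function names are hypothetical.

```python
import math

def intersection_anomalies(P, e1, e2, omega):
    """Return (theta, eps, D), or None when the two ellipses do not intersect.

    Assumed form of the intersection condition (reconstructed from Eq. 36):
        P * (1 + e1*cos f) = 1 + e2*cos(f - omega)
    which is rewritten as D*cos(f - theta) = P - 1.
    """
    a = e2 * math.cos(omega) - P * e1   # coefficient of cos f
    b = e2 * math.sin(omega)            # coefficient of sin f
    D = math.hypot(a, b)
    if D == 0.0 or abs(P - 1.0) > D:
        return None                     # no real solution for eps
    theta = math.atan2(b, a)
    eps = math.acos((P - 1.0) / D)
    return theta, eps, D

def one_impulse_may_be_optimal(eps, sin_eps_lower, sin_eps_upper):
    """Range test on sin(eps); the two limits must come from Eqs. 39 and 40."""
    return sin_eps_lower <= math.sin(eps) <= sin_eps_upper

# Example with the orbit pair quoted in the text and an arbitrary perigee angle.
P, e1, e2 = 1.2, 0.2, 0.2
omega = math.radians(90.0)
theta, eps, D = intersection_anomalies(P, e1, e2, omega)
f = theta - eps                          # true anomaly at the intersection
residual = P * (1.0 + e1 * math.cos(f)) - (1.0 + e2 * math.cos(f - omega))
```

The residual confirms that f = θ - ε lies on both orbits under the assumed
form of Eq. 36; the second root, f = θ + ε, is the other intersection.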

VII. CONCLUSION

A series of formulae for investigating the existence and properties
of optimal one-impulse transfers between pairs of "shallowly intersecting"
elliptical orbits has been developed and verified. In all cases tested,
each of two different two-impulse optimization programs converged upon
optimum one-impulse transfers predicted by these formulae. Numerical
experiments with orbit pairs obtained using Breakwell's procedure have
shown that such orbits satisfy the conditions specified by Eqs. 37 and 40.

As a result one may now discover these optimal maneuvers before proceeding
with two-impulse optimization procedures such as those described in Refs. 4
and 5.

SUMMARY


A study has been made of minimum-fuel transfer and rendezvous between
neighboring low-eccentricity orbits by power-limited rocket. This study
includes and extends previous work wherein only the case of transfer between
circular orbits was considered. As before, the analysis is based on the
assumption that only small deviations from an initial orbit are allowed.

Complete analytical solutions are obtained in three different sets of
variables: (1) rotating rectangular coordinates, (2) rotating spherical
coordinates, and (3) Lagrange's planetary variables. In addition to the
determination of optimal transfer and rendezvous trajectories in three
dimensions, synthesis of the optimal controls is also carried out in each
case. The guidance coefficients resulting from the control synthesis are
presented both in graphical form and in equation form suitable for use in
guidance applications.

The use of an intermediate reference orbit is found to be a powerful
method of improving the accuracy of the linearized theory. Results for
circular, coplanar earth-Venus and earth-Mars transfers are compared with
exact solutions. The linear theory is shown to provide a very good
correlation with exact data for all trip times of interest.

CONCLUSIONS 


1. Explicit solutions are obtainable for minimum-fuel transfer and
rendezvous between neighboring low-eccentricity orbits by power-limited
rockets. These solutions include closed form expressions for the optimum
thrust vector, the optimum trajectory, and the minimum required fuel
consumption in terms of boundary conditions and trip time.

2. Synthesis of the optimal control has also been carried out for both
transfer and rendezvous between any orbit and a neighboring, low-eccentricity
orbit. Guidance coefficients for each case can be presented in terms of time
remaining to reach the target orbit.

3. Results for the case of coplanar circle-to-circle transfer between
earth and Venus indicate that the linearized equations adequately predict the
actual motion, the optimal control, and the minimum fuel consumption. There
is, as yet, no numerical data to indicate that the rendezvous equations are
equally applicable to the planetary orbits. The failure of these equations
appears to be caused by the terms representing the angular motion.


RECOMMENDATIONS 


The results of the linearized analysis for earth-Mars and earth-Venus
transfers are sufficiently promising to warrant further investigation into
higher-order theories. In particular, the "piecewise-linear" theory
described herein is a relatively straightforward application of the
linearized equations which should include at least some second-order effects
on the motion. It is recommended that this approach be pursued because a
simple second-order solution is highly desirable.


It is characteristic of high-specific-impulse, low-thrust propulsion
systems that the source of power is separate from the thrust device itself.
Consequently, such propulsion systems are referred to as power-limited, since
thrust is restricted in magnitude by the output of the power supply, which is
in turn limited by the necessity of minimizing power supply weight.

The problem of transfer and rendezvous between neighboring orbits by a
power-limited rocket is of interest for two basic reasons. First of all, the
problem can be solved analytically, as was demonstrated in Refs. 1, 2, and 3,
provided that the thrust acceleration is not constrained in magnitude and
that the proper simplifying assumptions are made in the mathematical model
of the system. The analytic expressions thus obtained for the controls and
for the optimum trajectories then provide insight into more general problems
where the simplifying restrictions are lifted. Secondly, the solution to
this problem provides a lower bound to the performance requirements for
low-thrust orbital transfer and rendezvous.

It is interesting to note that if, for the same system model as has been
used herein, the thrust acceleration is assumed constant, analytic integration
of the equations of motion requires the evaluation of incomplete elliptic
integrals of the third kind (Ref. 4). Therefore, allowance for variable-thrust
acceleration is essential if simple analytic solutions are to be obtained.


ANALYTICAL  METHOD 


Description  of  the  Mathematical  Model 

The phrase "neighboring orbits", as defined here, requires that the
inclination between orbit planes be small and that the radial separation
between orbits be small relative to the semi-major axis of either orbit.
If it is further assumed that motion in the transfer orbit does not deviate
significantly from these neighboring orbits, linearization of the equations
of motion is permissible.

The analysis has been carried out in three sets of variables: (1) rotating
rectangular coordinates, (2) rotating spherical coordinates, and (3) Lagrange's
planetary variables. The rotating coordinates have been utilized previously
in Refs. 5, 6, and 7, while the planetary variables were applied to an orbit
transfer problem in Ref. 4.

The rotating coordinate systems are depicted in Figs. 1 and 2. Each
consists of an origin which revolves at satellite velocity in the initial
(interior) circular orbit and orthogonal coordinates measured from this
revolving origin. In the rectangular system of Fig. 1, y' is a radial
dimension, x' is measured tangent to the initial orbit at the origin, and
z' is a coordinate which is out of the plane of the initial orbit and is
normal to both x' and y'.

In Fig. 2, the spherical system is composed of a radial coordinate y,
an arc x, measured circumferentially from the origin, and another arc z,
which is orthogonal to the x-y plane.

The Lagrange planetary variables, which are derived from the elements of
an elliptic orbit and are used in the standard variation-of-parameters
equations of celestial mechanics (Ref. 8), are convenient because they
eliminate the necessity of treating singularities for zero eccentricity and
zero inclination in these equations. As they are used in this study, the
planetary variables consist of the nondimensionalized semi-major axis
x_1 = a/a_0, a circumferential distance component, x_4, and the following
combinations of the remaining orbital elements:

    x_2 = e \sin\omega
    x_3 = e \cos\omega
    x_5 = \sin i \sin\Omega
    x_6 = \sin i \cos\Omega                                        (1)
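
As an illustration of why these combinations avoid the singularities, the
sketch below maps classical elements to the variables of Eq. (1) and back.
Here e is eccentricity, ω the longitude of peri-apsis, i the inclination,
and Ω the longitude of the node. The function names are hypothetical, and
the circumferential component x_4 is left out because its definition is not
given at this point in the text.

```python
import math

def planetary_variables(a, e, omega, inc, node, a0):
    """Elements -> (x1, x2, x3, x5, x6) as in Eq. (1); angles in radians."""
    return {
        "x1": a / a0,
        "x2": e * math.sin(omega),
        "x3": e * math.cos(omega),
        "x5": math.sin(inc) * math.sin(node),
        "x6": math.sin(inc) * math.cos(node),
    }

def elements_from_variables(v, a0):
    """Inverse map; omega and node are arbitrary when e or inc is zero."""
    e = math.hypot(v["x2"], v["x3"])
    inc = math.asin(math.hypot(v["x5"], v["x6"]))
    omega = math.atan2(v["x2"], v["x3"])
    node = math.atan2(v["x5"], v["x6"])
    return v["x1"] * a0, e, omega, inc, node

# A low-eccentricity, low-inclination orbit (illustrative numbers only).
elems = (1.02, 0.05, math.radians(40.0), math.radians(3.0), math.radians(120.0))
variables = planetary_variables(*elems, a0=1.0)
recovered = elements_from_variables(variables, a0=1.0)

# A circular, uninclined orbit: omega and node are undefined as elements,
# but the planetary variables are simply zero.
circular = planetary_variables(1.0, 0.0, 0.0, 0.0, 0.0, a0=1.0)
```

For the circular, uninclined orbit the variables x_2, x_3, x_5, and x_6 all
vanish smoothly, whereas ω and Ω themselves have no defined value there.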
in that "fast" trajectories are allowed only when the linearizing assumptions
may be relaxed. On the other hand, fast trajectories are allowed in the
rectangular system because no limits are placed on the component velocities
in the linearizing process.


Analysis 

The optimization problem is to derive the optimal control equation for
the minimum-fuel transfer or rendezvous of a power-limited rocket between
neighboring orbits in a given time. Mathematically, this requires
minimization of the integral

    J = \frac{1}{2} \int_0^{t_f} (T/m)^2 \, dt
      = \int_0^{\tau_f} (n_0/2) A^2 \, d\tau
      = \int_0^{\tau_f} f_0(A) \, d\tau                            (2)

subject to constraints imposed by the equations of state, which may be
expressed in the form

    \dot{x}_i = f_i(x, A),    i = 1, ..., n                        (3)

The control is the thrust acceleration vector, A, in the present case.


The problem is treated as a problem of Lagrange in the calculus of
variations. In particular, Breakwell's formulation (Ref. 9) of this problem
is used because the linearized equations in the present case are
particularly well suited to this formulation.

If a fundamental function F is defined as

    F = -f_0 + \sum_{i=1}^{n} \lambda_i f_i                        (4)

the variational treatment requires satisfaction of Euler-Lagrange equations
in the following form as necessary conditions for the existence of an
extremal arc:

    \frac{d\lambda_i}{d\tau} = -\frac{\partial F}{\partial x_i}    (5)

    \frac{\partial F}{\partial A_j} = 0                            (6)
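
As a short worked step, condition (6) can be applied to the fundamental
function (4) with the integrand of Eq. (2), f_0 = (n_0/2) A^2. The sketch
assumes that each control component A_j enters additively, and only in the
state equation for the corresponding velocity component (written here as u,
v, w). That assumption is consistent with the linearized models of this
report but is not stated at this point in the text.

```latex
\frac{\partial F}{\partial A_j}
  = -\frac{\partial f_0}{\partial A_j}
    + \sum_{i=1}^{n} \lambda_i \frac{\partial f_i}{\partial A_j}
  = -n_0 A_j + \lambda_{(j)} = 0
\quad\Longrightarrow\quad
\lambda_u = n_0 A_x, \qquad
\lambda_v = n_0 A_y, \qquad
\lambda_w = n_0 A_z .
```

Under this assumption the optimal thrust acceleration is simply the vector
of velocity multipliers scaled by 1/n_0, which is why the controls follow
directly once the Euler-Lagrange equations (5) have been integrated.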
where e is eccentricity, ω is the longitude of peri-apsis, i is orbital
inclination, and Ω is the longitude of the ascending node. The planetary
variables provide a simple means of introducing eccentricity into the
terminal orbits, and the form of the state equations using these variables is
particularly simple in the present problem. However, in a practical
application, they might be less desirable than the rotating coordinates
because the orbital elements cannot be directly measured.

In view of the foregoing considerations, eccentric terminal orbits have
been allowed only in the planetary variables in this study, while the analysis
in rotating reference frames is confined to circular terminal orbits.

It  should  be  noted  here  that  the  three  sets  of  variables  are  entirely 
equivalent  in  that  the  equations  of  motion  may  be  transformed  directly  from 
one  set  to  another  by  substitution.  There  are  some  differences  in  the  required 
linearizing  assumptions  which  should  be  mentioned,  however. 

Consider the coordinate system depicted in Fig. 1, a rectangular system
with its origin fixed on the interior orbit (assumed to be the reference
orbit) in the x', y' plane. The mutually orthogonal coordinates x', y', and
z' form a triad that revolves with angular speed n_0 characteristic of the
reference orbit, so that motion in this frame of reference is relative to a
point on the reference orbit. The spherical coordinate system in Fig. 2 is
described by the arc x in the plane of the reference orbit, the arc z measured
normal to this plane, and a radial dimension y.

In order to linearize the equations of motion in the first system, it is
necessary to assume that excursions x', y', and z' from the origin be small
in comparison with the radius, r_0, of the reference orbit. Motion is
therefore constrained to a small sphere about the origin. No restrictions are
placed on the component velocities. In the rotating spherical system, only
the assumption of small component velocities will linearize the equations,
whereas the arc x is not limited. The resultant motion is constrained to
a torus about the reference orbit.

Since  the  linearized  equations  of  motion  are  identical  except  for 
differences  in  notation  (Ref.  5),  one  can  draw  the  conclusion  that,  if  in 
the  spherical  system  the  resultant  motion  does  not  involve  large  variations 
in  x,  the  velocity  components  may  be  large.  In  the  present  study,  use  of 
the  spherical  system  has  been  assumed  throughout,  and  the  results  may  be 
extended  according  to  the  foregoing  discussion. 

In the case of the planetary variables, the linearizing assumptions
require that the difference in the semi-major axes of the terminal orbits
be small and that the eccentricity of the terminal orbits as well as the
eccentricity of the instantaneous transfer orbit be small. The implications
of these assumptions are similar to those for the rotating spherical system

An additional necessary condition provided by the Pontryagin Maximum Principle
must also be satisfied to ensure that the stationary solution predicted by
the Euler equations is actually an extremum. The maximum principle, which
may be expressed as

    F(x_i, \lambda_i, A_j^*) \geq F(x_i, \lambda_i, A_j)           (7)

ensures that the stationary solution is an absolute maximum. Furthermore, it
has been shown (Ref. 10) that for a system where both the state variables and
the controls appear linearly in the state equations, the maximum principle
is also sufficient to ensure a minimum of the payoff, J. Since all cases in
the present analyses are linear in the controls and satisfy the maximum
principle, the optimum trajectories described herein are absolute extrema.

Due  to  the  great  number  of  equations  involved,  the  variational  analysis 
is  not  described  in  each  case.  Only  the  most  important  equations  are  included, 
and  these  are  grouped  in  an  orderly  fashion  in  the  appendixes.  The  rotating 
coordinate  systems  are  considered  in  Appendix  I,  and  the  planetary  variables 
are  considered  in  Appendix  II.  For  a more  detailed  account  of  the  application 
of  the  aforementioned  equations  the  reader  is  referred  to  Ref.  2 wherein  a 
specific  case  is  treated  in  detail. 


Synthesis  of  the  Optimal  Controls 

In order to put the equations for the optimized controls into a form
compatible with guidance requirements, several changes are made. First, t
in the control equations is replaced by -t. That is, the equations are
rewritten with "time-to-go" as the independent variable. Secondly, while in
the ordinary transfer and rendezvous analyses in rotating coordinates it was
generally convenient to assume zero initial conditions, the terminals are
reversed in the control synthesis. That is, the target orbit is assumed to
be defined by zero values in most of the state variables. The results of
the control synthesis are expressed in terms of the guidance coefficients,
∂A_j/∂x_i, of each component of the control vector, A.

The equations for the control synthesis are summarized in Appendix III
for transfer and rendezvous in each of the coordinate systems. Those
equations which deal specifically with transfer between circular orbits have
been presented previously in Ref. 3.

RESULTS


Orbit Transfer and Rendezvous

The multiplicity of solutions generated in this study (particularly for
rendezvous) precludes a graphical presentation of all the resulting
trajectories. An attempt is made to summarize the results in a reasonably
concise form, with orbit transfer solutions represented as special cases of
rendezvous wherever feasible.

To simplify the presentation of the results, only circle-to-circle
transfer and rendezvous cases are examined in the summary curves of Figs. 3
through 13. The first set of plots, Figs. 3 through 5, shows the variation
of the components of the optimal thrust acceleration with time for
circle-to-circle transfer only.

The in-plane components A_x/y_f and A_y/y_f are seen to display symmetry
about the midpoint in time for all trip times, as does the out-of-plane
component A_z/r_0 i_f. In particular, when τ_f = 2nπ, the components A_x/y_f
and A_y/y_f are constant with time, and the latter is zero. For the coplanar
problem, constant circumferential thrust acceleration is thereby specified as
the optimum mode for integral multiples of the period of the reference orbit,
a result that is in agreement with Ref. 7.
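
The constant-thrust result at τ_f = 2nπ can be reproduced from the transfer
constants and controls of Appendix I. The sketch below uses the closed forms
for C_1 and C_2 as listed there; the leading factor y_f/6 in C_4 is a
reconstruction (it is the value that makes the final conditions y = y_f,
u = (3/2) y_f, v = 0 hold), and the function names are hypothetical.

```python
import math

def transfer_constants(y_f, tau_f):
    """In-plane constants for circle-to-circle transfer (C_0 = 0)."""
    s, c = math.sin(tau_f), math.cos(tau_f)
    delta = 16.0 * (1.0 - c) - tau_f * (5.0 * tau_f + 3.0 * s)
    c1 = y_f * s / delta
    c2 = -y_f * (1.0 - c) / delta
    c4 = y_f * (5.0 * tau_f + 3.0 * s) / (6.0 * delta)  # y_f/6 is assumed
    return c1, c2, c4

def in_plane_controls(constants, tau):
    """A_x and A_y from the control equations with C_0 = 0."""
    c1, c2, c4 = constants
    a_x = 3.0 * c4 - 4.0 * c1 * math.cos(tau) + 4.0 * c2 * math.sin(tau)
    a_y = 2.0 * (c1 * math.sin(tau) + c2 * math.cos(tau))
    return a_x, a_y

def final_radial_state(constants, tau):
    """Closed-form y and v (C_0 = 0), used to confirm the end conditions."""
    c1, c2, c4 = constants
    s, c = math.sin(tau), math.cos(tau)
    y = (5.0 * (s - tau * c) * c1 + (5.0 * tau * s - 8.0 * (1.0 - c)) * c2
         + 6.0 * (s - tau) * c4)
    v = (5.0 * tau * s * c1 + (5.0 * tau * c - 3.0 * s) * c2
         - 6.0 * (1.0 - c) * c4)
    return y, v

y_f = 0.05
tau_f = 2.0 * math.pi                      # one period of the reference orbit
consts = transfer_constants(y_f, tau_f)
samples = [in_plane_controls(consts, k * tau_f / 8.0) for k in range(9)]
y_end, v_end = final_radial_state(consts, tau_f)

# A trip time that is not a multiple of the period, for comparison.
other = transfer_constants(y_f, 2.0)
y_other, v_other = final_radial_state(other, 2.0)
```

At τ_f = 2π both C_1 and C_2 vanish, so A_x reduces to the constant 3 C_4
and A_y is zero throughout, in line with the statement above.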

Figures 6 through 8 show the thrust acceleration components for
circle-to-circle rendezvous at a particular trip time equal to one sixth of
an orbital period of the reference orbit. The parameter in Figs. 6 and 7 is
x_f/y_f τ_f, which takes on the value of 3/4 for the special case of optimum
transfer. Similarly, the out-of-plane component is plotted with Ω_f as a
parameter. As indicated, the longitude of the node can have either of two
values, 150 or 330 deg, for optimum transfer.

The payoff, J, can be best represented as the sum of three components,
J1, J2, and J3, which are defined by Eqs. (A-44) and (A-45) and are plotted in
Figs. 9 through 11. The components J1 and J2 define propellant requirements
for coplanar rendezvous, while the addition of J3 introduces the out-of-plane
requirement. In particular, J is equal to J1 for coplanar transfer since the
term x_f/y_f τ_f - 3/4 in J2 is zero for optimum transfer.

All three components, as well as their sum, are seen to be monotonically
decreasing functions of τ_f. In the limit, as τ_f → ∞, A → 0 and J → 0. This
is a consequence of the fact that no limit has been placed on exhaust
velocity. Similarly, all three components tend to infinity as τ_f approaches
zero because zero trip time requires infinite thrust acceleration.

An interesting feature of J3 is evident from Fig. 11. For τ_f = kπ, where
k = 0, 1, 2, ..., J3 is the same for all nodal longitudes, Ω_f. For all other
times the envelope of the family of curves is given by the equations

    J_{3,max} = \frac{1}{\tau_f - |\sin\tau_f|}                    (8)

    J_{3,min} = \frac{1}{\tau_f + |\sin\tau_f|}                    (9)

where the lower envelope is given by Eq. (9) and represents J3 for optimum
transfer.
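
The envelopes can be checked against the dependence of J3 on the nodal
longitude. The sketch below assumes that, apart from a constant factor, J3
varies with Ω_f as [τ_f - sin τ_f cos(2Ω_f + τ_f)] / (τ_f² - sin² τ_f), which
is the form read from the last term of Eq. (A-45); the normalization and the
function names are assumptions.

```python
import math

def j3_shape(tau_f, node):
    """J3 up to a constant factor, as a function of nodal longitude (rad)."""
    s = math.sin(tau_f)
    return (tau_f - s * math.cos(2.0 * node + tau_f)) / (tau_f**2 - s**2)

def j3_envelopes(tau_f):
    """Lower and upper envelopes in the form of Eqs. (9) and (8)."""
    s = abs(math.sin(tau_f))
    return 1.0 / (tau_f + s), 1.0 / (tau_f - s)

# Trip time of one sixth of an orbital period, as in Figs. 6 through 8.
tau_f = math.pi / 3.0
values = [j3_shape(tau_f, math.radians(d)) for d in range(360)]
lower, upper = j3_envelopes(tau_f)

# Nodal longitudes (whole degrees) at which the lower envelope is reached.
best_nodes_deg = [d for d in range(360)
                  if abs(j3_shape(tau_f, math.radians(d)) - lower) < 1e-9]

# At tau_f = k*pi the sine term vanishes and the spread over nodes collapses.
spread_at_pi = (max(j3_shape(math.pi, math.radians(d)) for d in range(360))
                - min(j3_shape(math.pi, math.radians(d)) for d in range(360)))
```

For this trip time the minimum falls at the two nodal longitudes quoted in
the text for optimum transfer, and the spread between the envelopes closes
when τ_f is a multiple of π.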


Choice  of  Reference  Orbit 

It has been observed that the linearized equations are applicable only
for orbits which are not separated by large radial distances. More
specifically, excursions from the origin in the y direction should always be
small. It is apparent, however, that when the reference orbit is chosen to
have the same radius as the initial orbit the excursion, y, to the final
orbit is maximized. A better reference orbit would be one midway between the
terminal orbits, since this device would guarantee a radial excursion no
greater than half the distance between the terminals.

Although, for the most part, the equations of this report are based on a
reference orbit coincident with the initial orbit, Eqs. (A-55) through (A-51)
and (A-131) through (A-134) are exceptions in this respect. These equations
are derived to account for an arbitrary choice of the reference orbit and may
therefore be applicable in cases where the ordinary equations break down.


Application  to  Planetary  Orbits 

Strictly speaking, none of the planetary orbits are "neighboring orbits"
in the sense in which this term has been defined. Earth's closest neighbor,
Venus, has a semi-major axis a = 0.7233 AU compared with a = 1.0 AU for earth,
leaving a separation distance of 0.2767 AU, which is not << 1.0 AU. However,
using the improvement referred to above, it is possible to apply the linearized
analysis to earth-Venus and earth-Mars trajectories with remarkably good
accuracy. In Figs. 12 and 13, comparisons have been made with exact solutions
from Ref. 11 for earth-Venus and earth-Mars transfers. The circled points
were calculated from Eq. (A-48) of Appendix I using a reference orbit midway
between the two terminal orbits. These results for the special case of
uninclined, circular terminal orbits show only a slight discrepancy in J
for transfer times up to one earth year.
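
The effect of the reference-orbit choice on the radial excursion can be put
in numbers for the earth-Venus case. This is a minimal arithmetic sketch
using only the semi-major axes quoted above; the function name is
hypothetical.

```python
def max_radial_excursion(r_initial, r_final, r_reference):
    """Largest radial distance of either terminal orbit from the reference."""
    return max(abs(r_initial - r_reference), abs(r_final - r_reference))

A_EARTH = 1.0      # AU
A_VENUS = 0.7233   # AU

# Reference orbit coincident with the initial (earth) orbit.
coincident = max_radial_excursion(A_EARTH, A_VENUS, A_EARTH)

# Reference orbit midway between the terminal orbits.
r_mid = 0.5 * (A_EARTH + A_VENUS)
midway = max_radial_excursion(A_EARTH, A_VENUS, r_mid)

# Excursion relative to the radius of the reference orbit in each case.
relative_coincident = coincident / A_EARTH
relative_midway = midway / r_mid
```

The midway choice halves the largest excursion, from 0.2767 AU to about
0.138 AU, which is the sense in which it extends the range of the linearized
equations.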

Extension of the Linearized Theory

Based on the successful correlation indicated by Figs. 12 and 13, a new
theory is being considered in order to account for second-order effects in J.
This theory is a "piecewise-linear" analysis which may be described as follows:
The transfer (or rendezvous) is divided into two steps, each requiring a
portion of the total trip time. The first segment of the trajectory consists
of a rendezvous from the initial orbit to an intermediate orbit of unspecified
size and shape, and the second segment is a rendezvous from this intermediate
orbit to the final terminal orbit. The expression for J is composed of two
linear expressions for the two segments, and the parameters of the intermediate
orbit are considered as variables which may be optimized so as to minimize the
total J. In each segment an appropriate reference orbit is chosen so as to
improve the accuracy of the theory.

This approach should provide better results than the linearized theory.
Since the results for earth-Mars and earth-Venus transfers were already good,
the piecewise-linear theory may approach exact results in these cases and
might even yield acceptable results for trajectories to the outer planets.

Control  Synthesis 

In this study it has been possible to express each of the components of
the optimal control vector, A, as a linear function of the n state variables:

    A_j = \sum_{i=1}^{n} \frac{\partial A_j}{\partial x_i} x_i     (10)

Therefore, the presentation of the results can be confined to curves of the
guidance coefficients, ∂A_j/∂x_i, plotted against time to go, t'. Using the
equations for the guidance coefficients which comprise Appendix III, the
summary curves of Figs. 14 through 25 were generated.

The synthesized controls for the case of transfer between an arbitrary
state and a nearby circular orbit appear in Figs. 14 through 16 in terms of
the rotating coordinate system variables. The extension to include eccentricity
of the final orbit is provided by use of the Lagrange planetary variables in
Figs. 17 through 19.

For rendezvous the same procedure is followed in the presentation of the
synthesized controls, with the addition of curves to account for the dependence
of in-plane thrust acceleration components on the circumferential distance.
In rotating coordinates, Figs. 20 through 22 summarize the results for
rendezvous between any initial state and a point on a nearby circular orbit.
As in the transfer case, the planetary variables facilitate the extension
to rendezvous between an initial state and a point on a nearby orbit of low
eccentricity. The results for the planetary variables appear in Figs. 23
through 25.

All the curves for the guidance coefficients display similar behavior.
When time-to-go is short, the curves diverge to infinity (either positive
or negative), but a damped oscillation is evident, causing the coefficients
to approach zero for very long times.

LIST OF SYMBOLS

A              Thrust-to-mass ratio, (1/n_0)(T/m)
C              Integration constant
f              Rate of change of a state variable
F              Fundamental function
J              Defined by Eq. (2)
D              Defined by Eq. (A-154)
B              Defined by Eq. (A-182)
Q              Defined by Eq. (A-181)
$              Defined by Eq. (A-146)
λ              Lagrange multiplier
r              Radius
R              Radial force
W              Normal force
S              Circumferential force
n              Mean angular motion
x, y, z        Position components in spherical system
x', y', z'     Position components in rectangular system
u, v, w        Velocity components in x, y, z directions
t'             Time to go
η              True anomaly
ω              Longitude of peri-apsis
e              Eccentricity
N              Unit vector normal to instantaneous transfer orbit
a              Semi-major axis
Ω              Longitude of the node
i              Inclination
x_1            a/a_0
x_2            e sin ω
x_3            e cos ω
x_5            sin i sin Ω
x_6            sin i cos Ω
c              Angular momentum vector

Subscripts

i              Index denoting x, y, z, u, v, w
j              Index denoting x, y, z
0              Initial condition
f              Final condition
x,y,z,u,v,w    Denoting state variable
1              Intermediate reference orbit
R              Radial
S              Circumferential
W              Normal

Superscripts

*              Optimum condition
               Denotes a vector

APPENDIX I

ROTATING RECTANGULAR AND SPHERICAL COORDINATE SYSTEMS


    \dot{\lambda}_x = 0                                            (A-7)
    \dot{\lambda}_y = -3 \lambda_v                                 (A-8)
    \dot{\lambda}_z = \lambda_w                                    (A-9)
    \dot{\lambda}_u = -\lambda_x + 2 \lambda_v                     (A-10)
    \dot{\lambda}_v = -\lambda_y - 2 \lambda_u                     (A-11)
    \dot{\lambda}_w = -\lambda_z                                   (A-12)
    \lambda_u = n_0 A_x                                            (A-13)
    \lambda_v = n_0 A_y                                            (A-14)
    \lambda_w = n_0 A_z                                            (A-15)
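
A quick numerical check of these multiplier equations against their
integrated forms, Eqs. (A-16) through (A-21), can be sketched as follows.
The signs of the right-hand sides and the assignment of C_3 and C_5 to the
out-of-plane multipliers are readings of a damaged original, so the check
mainly confirms that this reading is self-consistent. n_0 = 1 and the six
constants are arbitrary illustrative values.

```python
import math

N0 = 1.0  # mean angular motion of the reference orbit (illustrative)

def multipliers(tau, C):
    """Integrated Euler-Lagrange solutions; C = (C0, C1, C2, C3, C4, C5)."""
    c0, c1, c2, c3, c4, c5 = C
    s, c = math.sin(tau), math.cos(tau)
    return {
        "x": N0 * c0,
        "y": -6.0 * N0 * (c4 + c0 * tau - c1 * c + c2 * s),
        "z": 2.0 * N0 * (c5 * s + c3 * c),
        "u": N0 * (3.0 * c4 + 3.0 * c0 * tau - 4.0 * c1 * c + 4.0 * c2 * s),
        "v": 2.0 * N0 * (c0 + c1 * s + c2 * c),
        "w": 2.0 * N0 * (c5 * c - c3 * s),
    }

def euler_lagrange_rhs(lam):
    """Right-hand sides of the six differential multiplier equations."""
    return {
        "x": 0.0,
        "y": -3.0 * lam["v"],
        "z": lam["w"],
        "u": -lam["x"] + 2.0 * lam["v"],
        "v": -lam["y"] - 2.0 * lam["u"],
        "w": -lam["z"],
    }

def max_residual(C, taus, h=1e-5):
    """Largest mismatch between a central difference and the right-hand side."""
    worst = 0.0
    for tau in taus:
        ahead, behind = multipliers(tau + h, C), multipliers(tau - h, C)
        rhs = euler_lagrange_rhs(multipliers(tau, C))
        for key in rhs:
            derivative = (ahead[key] - behind[key]) / (2.0 * h)
            worst = max(worst, abs(derivative - rhs[key]))
    return worst

constants = (0.3, -0.7, 1.1, 0.4, -0.2, 0.9)
residual = max_residual(constants, [0.1 * k for k in range(1, 60)])
```

The residual is limited only by the finite-difference step, which indicates
that the closed-form multipliers satisfy the differential equations as
written here.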

3. Integrated Euler-Lagrange Equations

    \lambda_x = n_0 C_0                                                        (A-16)
    \lambda_y = -6 n_0 (C_4 + C_0 \tau - C_1 \cos\tau + C_2 \sin\tau)          (A-17)
    \lambda_z = 2 n_0 (C_5 \sin\tau + C_3 \cos\tau)                            (A-18)
    \lambda_u = n_0 (3 C_4 + 3 C_0 \tau - 4 C_1 \cos\tau + 4 C_2 \sin\tau)     (A-19)
    \lambda_v = 2 n_0 (C_0 + C_1 \sin\tau + C_2 \cos\tau)                      (A-20)
    \lambda_w = 2 n_0 (C_5 \cos\tau - C_3 \sin\tau)                            (A-21)


4. Boundary Conditions

                          Transfer                      Rendezvous
State Variable    τ = 0    τ = τ_f               τ = 0    τ = τ_f
x                 0        FREE                  0        x_f
y                 0        y_f                   0        y_f
z                 0        z_f                   0        z_f
u                 0        (3/2) y_f             0        (3/2) y_f
v                 0        0                     0        0
w                 0        (r_0² i_f² - z_f²)^1/2    0    (r_0² i_f² - z_f²)^1/2

(1) Ref. 6
(2) Ref. 5


5. Integrated Equations of State (with initial conditions)

    x = [16(\tau - \sin\tau) - \tfrac{3}{2}\tau^3] C_0
        + [16(1 - \cos\tau) - 10 \tau \sin\tau] C_1
        + [22 \sin\tau - 10 \tau \cos\tau - 12 \tau] C_2
        - [\tfrac{9}{2}\tau^2 - 12(1 - \cos\tau)] C_4                          (A-22)

    y = [8(1 - \cos\tau) - 3\tau^2] C_0 + 5[\sin\tau - \tau \cos\tau] C_1
        + [5 \tau \sin\tau - 8(1 - \cos\tau)] C_2 + 6[\sin\tau - \tau] C_4     (A-23)

    z = [\tau \cos\tau - \sin\tau] C_3 + [\tau \sin\tau] C_5                   (A-24)

    u = [16(1 - \cos\tau) - \tfrac{9}{2}\tau^2] C_0
        + [6 \sin\tau - 10 \tau \cos\tau] C_1
        + [10 \tau \sin\tau - 12(1 - \cos\tau)] C_2
        + [12 \sin\tau - 9\tau] C_4                                            (A-25)

    v = [8 \sin\tau - 6\tau] C_0 + [5 \tau \sin\tau] C_1
        + [5 \tau \cos\tau - 3 \sin\tau] C_2 - 6[1 - \cos\tau] C_4             (A-26)

    w = [-\tau \sin\tau] C_3 + [\sin\tau + \tau \cos\tau] C_5                  (A-27)

6. Transversality Conditions - Transfer

    \lambda_x = C_0 = 0                                                        (A-28)

_C» _ tonTf + if-

I - -jj- *onrf                                                                 (A-29)

7. Constants of Integration - Transfer

    C_1 = \frac{y_f \sin\tau_f}
               {16(1 - \cos\tau_f) - \tau_f(5\tau_f + 3\sin\tau_f)}            (A-30)

    C_2 = \frac{-y_f (1 - \cos\tau_f)}
               {16(1 - \cos\tau_f) - \tau_f(5\tau_f + 3\sin\tau_f)}            (A-31)

    C_3 = \frac{(\sin\tau_f + \tau_f \cos\tau_f) z_f
                - (\tau_f \sin\tau_f) \sqrt{r_0^2 i_f^2 - z_f^2}}
               {\tau_f^2 - \sin^2\tau_f}                                       (A-32)
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5rf+  3sinrf ) 

16  ( I - COST,)  -t,(5t,  + 3sinrf) 


Rendezvous 


c0  = 


r»  y»( 


J ii_ 

Vfrf 


) ( 5t,  — 3sinr, ) 


-|-rf  (5t,  — 3sinr,)(T,2  -80)  + 4(1  - cosi,)(  7It,2-64)  + 248r,  2 cost, 


C,  = 


y,  si  nr, 


16(1  — COST,)  - r,  ( 5 t,  + 3sinr,  ) 


3sinr,  - 8(  I -cost,) 
5t,  - 3sinT, 


C,  = 


- y,  ( I - cos  t,  ) 


2 16(1  - cost,)  - t,(5t,  + 3 sinr,) 


+■  C„ 


3t,(  I + cost,)  - 8 sinT, 
5t,  - 3sinT, 


( sinT,  + t,co$  t,  ) z,  - ( T,sinT,)  </r0  2 I \ - Zf2 


(t,2  - sin2T, ) 


c4 


x 


(5t,+-  3sinrf) 

I6(  I - cost,)  - t,(5t,  + 3 sinT, ) 


C0 


2 


(T,sinT,)z,  4-  (t,cost,  - sinT,) 
(Tf2  - sin*T,) 


2 . 2 


- 2f 


8.  Controls 


$$A_x = 3C_4 + 3C_0\tau - 4C_1\cos\tau + 4C_2\sin\tau$$


(A-33) 


(A-34) 


(A-35) 


(a-36) 


(A-37)- 


(A-38) 


(A-39) 


(A-40) 
$$A_y = 2\,[\,C_0 + C_1\sin\tau + C_2\cos\tau\,] \tag{A-41}$$

Az  = 2 [c5cosr  - CjSinrj 

(A-42) 


9,  Payoff 

Transfer 


J 


f\>3  ro* 


(~77)2  (5rf->-3sin^) 

+ 3 sin  t, ) - 16  ( I - cosrf )] 


t,  +|sinrf|  (A-43) 


Rendezvous 


(A-44) 


n0V 


(A^(5rf  + 3sinrf) 

e[rf(5rf  + 3sinr()  - 16(1  - cost,)]  (a-45) 

" t)2  <5T'-5si"r'1 

-4~  — " " — — “ 2 2 

-J  rf  ( 5rt  - 3sinr,  )(r,2-  80 ) +■  4(  I - cost,)(7|t,  -64)  + 248rf  cost, 


. 2 r Tf  - sinT,  COS  ( 2fl,+  T,  ) 

*■  (t,2  — sin2T,) 

10. It should be pointed out that for each free end condition in the case of
orbit transfer, the variational analysis predicts an optimum value for that
particular state variable at the end point. In the rotating coordinate systems
the x and z coordinates are left open at the final time, $\tau_f$. The end point
for the optimal transfer is then determined in the analysis and is defined by
the equations




c -910098-12 


(*r- 


(A-46) 


(a-47) 


11.  Payoff  Equations  with  an  Intermediate  Reference  Orbit 

Let the origin revolve in a circular orbit of radius $r_I$ between the two
terminal orbits such that the radial distance to the outer orbit is $r_f - r_I$
and the radial distance to the inner orbit is $r_I - r_0$. The radii $r_0$ and
$r_f$ refer to the inner and outer orbits, respectively.

Transfer 


-.V 


4r  ( Il7~q')2(  + 3 sinTf ] 

rf  ( 5rf  + 3sinrf ) - I6(  I - cosrf) 


Tf  + I sinTf 


(a-48) 


Rendezvous 


(-hr) ( 5rf+  3sinrf ) 


8 v rj 


nx3 r-j.2  rf  ( 5rf  + 3sinrf ) - 16  ( I - cosrf ) 

*2  3 / rf  + r0 


rf  ( 5rf  - 3sinrf )(  rf  2 - 80  ) + 4(  I -cosxf  X 7irf2  -64  ) + 248  rf  2 cosrf 


(a-49) 


+ 


rfz—  sinrf  cos  ( 2^7 f +rf) 

2 7~z 

rf  — sin  rf 
APPENDIX  II
LAGRANGE'S  VARIABLES


In the theory of special perturbations, as derived in Ref. 8 for example,
the equations for rates of change of the elements of an elliptic orbit are
written in terms of the elements and acceleration components S, R, and W,
which are perpendicular to the radius vector, radial, and normal to the orbital
plane, respectively.

Consider the five elements $a$, $e$, $i$, $\omega$, $\Omega$. The equations for
small rates of change of these variables are


do 

dt 


+ 


S(l  + e 


(A-52) 


de 

dt 


•/l  - e2 

no 


[ R sin  17 


2 cos  7)  + e + 


e cos  zrf 


I 4-  e cos  17 


(A-53) 


W cos(uH -r)) 


(A-54) 


dw 

dT 


noe 


[ - R cos  17 


2 +■  e cos-n 
I +ecostj^  S,n^ 


e ton  -g-  sintcj+Tj) 
l + ecosij 


(A-55) 


d a 

dt 


■■■■-■■  sin(w+-n) 
sin  1 1 


(A-56) 


In order to avoid singularities for zero eccentricity and inclination in
Eqs. (A-55) and (A-56), these equations may be transformed according to the
following definitions:


$$x_2 = e\sin\omega \tag{A-57}$$

$$x_3 = e\cos\omega \tag{A-58}$$

$$x_5 = \sin i\,\sin\Omega \tag{A-59}$$

$$x_6 = \sin i\,\cos\Omega \tag{A-60}$$
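
The definitions (A-57) through (A-60) can be illustrated with a short numerical sketch. The function names below are hypothetical and are not taken from the report; the sketch only shows that the transformed variables stay well defined when the eccentricity or the inclination is zero, where $\omega$ and $\Omega$ themselves lose their meaning.

```python
import math


def lagrange_variables(e, omega, inc, big_omega):
    """Return (x2, x3, x5, x6) as defined in Eqs. (A-57)-(A-60); angles in radians."""
    x2 = e * math.sin(omega)
    x3 = e * math.cos(omega)
    x5 = math.sin(inc) * math.sin(big_omega)
    x6 = math.sin(inc) * math.cos(big_omega)
    return x2, x3, x5, x6


def eccentricity_and_sin_inclination(x2, x3, x5, x6):
    """Recover e and sin(i) from the transformed variables."""
    return math.hypot(x2, x3), math.hypot(x5, x6)


# For a circular, uninclined orbit the result does not depend on the
# (undefined) angles omega and Omega.
circular = lagrange_variables(0.0, 1.0, 0.0, 2.0)
```

For a slightly eccentric orbit, `eccentricity_and_sin_inclination(*lagrange_variables(0.1, 0.7, 0.05, 2.0))` returns the eccentricity and the sine of the inclination to within rounding.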

Under  the  assumptions 

e < < | 
0 * o0 
n * n0 

(a-6i) 

r 

- = w + ij 

i < < 1 

R 

A . s w 

(a-62) 

Ap  • 7"  * 

R o0n0 

5 Oono  °o"o 

and  with  the  further  definitions 

Q 

x,  = — 

1 Oo 

(A-63) 

X4  = X 

(A-64)

the equations of state for the variational problem may be derived from Eqs.
(A-52) through (A-60).

There is a direct equivalence between these equations and the equations
of state in the rotating coordinate system variables. That is, each of the
Lagrange variables $x_1, x_2, x_3, \ldots, x_6$ can be expressed in terms of the
rotating coordinate variables x, y, z, u, v, and w.

Referring to Fig. 26, define a position vector $\vec r$ in nonrotating
coordinates originating at the center of attraction F. Assume the motion
out of the reference plane is uncoupled from the in-plane motion.

Relative to a rotating rectangular coordinate system originating at 0
and rotating with angular velocity $\vec n$, this vector is




$$\vec r = x\,\vec i + (r_0 + y)\,\vec j \tag{A-65}$$


where the unit vectors $\vec i$ and $\vec j$ are taken in the x and y directions,
respectively. The vector velocity $\vec V$ is obtained by differentiating $\vec r$.

$$\vec V = u\,\vec i + v\,\vec j + \vec n \times \vec r \tag{A-66}$$

Since $\vec n = n_0\vec k$, the expression for $\vec V$ is

$$\vec V = [\,u - n_0(r_0 + y)\,]\,\vec i + (v + n_0 x)\,\vec j \tag{A-67}$$

Using Eqs. (A-65) and (A-67), expressions can be written for the angular
momentum $\vec C$, the path speed $V$, and the radius $r$ of the vehicle

$$\vec C = \vec r \times \vec V = \{\,x(v + n_0 x) - (r_0 + y)[\,u - n_0(r_0 + y)\,]\,\}\,\vec k \tag{A-68}$$

$$V = \sqrt{\vec V \cdot \vec V} = \sqrt{[\,u - n_0(r_0 + y)\,]^2 + [\,v + n_0 x\,]^2} \tag{A-69}$$

$$r = \sqrt{\vec r \cdot \vec r} = \sqrt{x^2 + (r_0 + y)^2} \tag{A-70}$$
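
As a quick check on Eqs. (A-65) through (A-70), the sketch below evaluates the angular momentum, speed, and radius from the rotating-coordinate state. The function name and argument order are illustrative assumptions, not part of the report. For a vehicle at rest at the origin of the rotating frame the result should reduce to the reference circular orbit, that is $|C| = n_0 r_0^2$, $V = n_0 r_0$, and $r = r_0$.

```python
import math


def inertial_quantities(x, y, u, v, r0, n0):
    """Return (C, V, r): k-component of angular momentum, speed, radius."""
    # Velocity components along i and j, Eq. (A-67).
    v_i = u - n0 * (r0 + y)
    v_j = v + n0 * x
    # Position is x i + (r0 + y) j, Eq. (A-65); the cross product with the
    # velocity has only a k component, Eq. (A-68).
    c_k = x * v_j - (r0 + y) * v_i
    speed = math.hypot(v_i, v_j)      # Eq. (A-69)
    radius = math.hypot(x, r0 + y)    # Eq. (A-70)
    return c_k, speed, radius


# Vehicle at rest at the rotating origin: the reference circular orbit.
c_ref, v_ref, r_ref = inertial_quantities(0.0, 0.0, 0.0, 0.0, r0=2.0, n0=0.5)
```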


The following equations can be written for the angular momentum, speed, and
radius of a body in an inverse square field.


(A-71) 


1 c 1 = yKo(  1 - •*) 


V 


* y<?)'  + (r^)2 


(A-72) 


r = q(l  -e2) 
l+e  cos  Tj 


(A-73) 




Combining these equations with the absolute value of $\vec C$, and with V and r
from Eqs. (A-68), (A-69), and (A-70), the following scalar equations result.


~ * ( I + ) ( I + e cos  ) 

°o  r» 


u 


v . x . / e cos  2 

V.  + r.  V SL 

Oq 


Finally, noting that

$$\frac{a}{a_0} = x_1, \qquad x_2 = e\sin\omega, \qquad x_3 = e\cos\omega$$

$$e\cos\eta = e\cos(\tau - \omega) = x_2\sin\tau + x_3\cos\tau$$


(A-7 


(A-7 


(a-7 


(A-7 


the  equations  relating  the  coordinates  are  obtained. 


JL 

'o 


( x,  - I ) - x2  sinr  - x,  cost 


(A-l 


v 

rv»r0 


X,  COST  - x2  sinr 


(A-7 


■— — = ( x.  - I ) - 2x2  sinr  - gx.cosr  (A"f 

noro  2 1 2 * 

The components of the out-of-plane motion can be related in the following
way. If $\vec N$ is a unit vector normal to the instantaneous transfer orbit and
$\vec s$ is a unit vector in the direction of the line of nodes, then

$$\vec s = \vec N \times \vec k \tag{A-81}$$

and, since the angle between $\vec s$ and the vehicle is $\tau - \Omega$,

$$\cos(\tau - \Omega) = \vec s \cdot \vec i \tag{A-82}$$

Also, the orbital inclination is

$$\cos i = \vec N \cdot \vec k \tag{A-83}$$

Using these parameters the equation for the elevation, z, of the probe is

$$\frac{z}{r_0} = \tan i\,\sin(\tau - \Omega) \approx \sin i\,\sin(\tau - \Omega) \tag{A-84}$$

or

$$\frac{z}{r_0} = -x_5\cos\tau + x_6\sin\tau \tag{A-85}$$
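
The step from Eq. (A-84) to Eq. (A-85) is the expansion $\sin i\,\sin(\tau - \Omega) = \sin i\cos\Omega\,\sin\tau - \sin i\sin\Omega\,\cos\tau$ combined with the definitions $x_5 = \sin i\sin\Omega$ and $x_6 = \sin i\cos\Omega$. The following sketch, with hypothetical function names, checks the identity numerically for arbitrary angles.

```python
import math


def elevation_from_elements(inc, big_omega, tau):
    """z / r0 from the small-inclination form of Eq. (A-84)."""
    return math.sin(inc) * math.sin(tau - big_omega)


def elevation_from_lagrange(x5, x6, tau):
    """z / r0 from Eq. (A-85)."""
    return -x5 * math.cos(tau) + x6 * math.sin(tau)


inc, big_omega, tau = 0.03, 0.8, 2.1
x5 = math.sin(inc) * math.sin(big_omega)   # Eq. (A-59)
x6 = math.sin(inc) * math.cos(big_omega)   # Eq. (A-60)
difference = elevation_from_elements(inc, big_omega, tau) - elevation_from_lagrange(x5, x6, tau)
```

The value of `difference` is zero to within rounding for any choice of the three angles.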

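The out-of-plane velocity, w, is

$$\frac{w}{n_0 r_0} = x_5\sin\tau + x_6\cos\tau \tag{A-86}$$

1. Equations of State

$$\frac{dx_1}{d\tau} = 2A_S \tag{A-87}$$

$$\frac{dx_2}{d\tau} = 2A_S\sin\tau - A_R\cos\tau \tag{A-88}$$

$$\frac{dx_3}{d\tau} = 2A_S\cos\tau + A_R\sin\tau \tag{A-89}$$

$$\frac{dx_4}{d\tau} = \tfrac{3}{2}(x_1 - 1) - 2x_2\sin\tau - 2x_3\cos\tau \tag{A-90}$$

$$\frac{dx_5}{d\tau} = -A_W\sin\tau \tag{A-91}$$

$$\frac{dx_6}{d\tau} = A_W\cos\tau \tag{A-92}$$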
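2. Euler-Lagrange Equations

$$\frac{d\lambda_1}{d\tau} = -\tfrac{3}{2}\lambda_4 \tag{A-93}$$

$$\frac{d\lambda_2}{d\tau} = 2\lambda_4\sin\tau \tag{A-94}$$

$$\frac{d\lambda_3}{d\tau} = 2\lambda_4\cos\tau \tag{A-95}$$

$$\frac{d\lambda_4}{d\tau} = \frac{d\lambda_5}{d\tau} = \frac{d\lambda_6}{d\tau} = 0 \tag{A-96}$$

$$n_0 A_S = 2(\lambda_1 + \lambda_2\sin\tau + \lambda_3\cos\tau) \tag{A-97}$$

$$n_0 A_R = -\lambda_2\cos\tau + \lambda_3\sin\tau \tag{A-98}$$

$$n_0 A_W = -\lambda_5\sin\tau + \lambda_6\cos\tau \tag{A-99}$$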

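3. Integrated Euler-Lagrange Equations

$$\lambda_1 = \lambda_{10} - \tfrac{3}{2}\lambda_4\tau \tag{A-100}$$

$$\lambda_2 = \lambda_{20} - 2\lambda_4\cos\tau \tag{A-101}$$

$$\lambda_3 = \lambda_{30} + 2\lambda_4\sin\tau \tag{A-102}$$

$$\lambda_4 = \text{constant} \tag{A-103}$$

$$\lambda_5 = \text{constant} \tag{A-104}$$

$$\lambda_6 = \text{constant} \tag{A-105}$$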
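
The integrated multipliers can be checked against the Euler-Lagrange equations numerically. The sketch below assumes the forms $\lambda_1 = \lambda_{10} - \tfrac{3}{2}\lambda_4\tau$, $\lambda_2 = \lambda_{20} - 2\lambda_4\cos\tau$, and $\lambda_3 = \lambda_{30} + 2\lambda_4\sin\tau$, with $\lambda_{10}$, $\lambda_{20}$, $\lambda_{30}$ treated as integration constants; these forms are read from a damaged scan, so they are an assumption here rather than a quotation. Substituting them into $n_0A_S = 2(\lambda_1 + \lambda_2\sin\tau + \lambda_3\cos\tau)$ and $n_0A_R = -\lambda_2\cos\tau + \lambda_3\sin\tau$ makes the $\lambda_4\sin\tau\cos\tau$ cross terms cancel in $A_S$ and leaves a constant $2\lambda_4$ in $A_R$. All function names are illustrative.

```python
import math


def costates(tau, lam10, lam20, lam30, lam4):
    """Assumed integrated multipliers lambda_1, lambda_2, lambda_3."""
    lam1 = lam10 - 1.5 * lam4 * tau
    lam2 = lam20 - 2.0 * lam4 * math.cos(tau)
    lam3 = lam30 + 2.0 * lam4 * math.sin(tau)
    return lam1, lam2, lam3


def costate_rates(tau, lam4):
    """Right-hand sides of the Euler-Lagrange equations for lambda_1..3."""
    return -1.5 * lam4, 2.0 * lam4 * math.sin(tau), 2.0 * lam4 * math.cos(tau)


def n0_as(tau, lam10, lam20, lam30, lam4):
    """Circumferential control from the multipliers."""
    lam1, lam2, lam3 = costates(tau, lam10, lam20, lam30, lam4)
    return 2.0 * (lam1 + lam2 * math.sin(tau) + lam3 * math.cos(tau))


def n0_ar(tau, lam10, lam20, lam30, lam4):
    """Radial control from the multipliers."""
    _, lam2, lam3 = costates(tau, lam10, lam20, lam30, lam4)
    return -lam2 * math.cos(tau) + lam3 * math.sin(tau)


def n0_as_closed_form(tau, lam10, lam20, lam30, lam4):
    """Same control after the cross terms in lambda_4 cancel."""
    return 2.0 * lam10 - 3.0 * lam4 * tau + 2.0 * lam20 * math.sin(tau) + 2.0 * lam30 * math.cos(tau)
```

A central difference of `costates` reproduces `costate_rates`, and `n0_as` agrees with `n0_as_closed_form` at any `tau`.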


4.  Boundary  Conditions 

A great simplification in the complexity of the equations can be achieved
by taking advantage of the symmetry afforded by the Lagrange variables $x_2$ and
$x_3$. Therefore, in performing the integrations it will be convenient to use the
limits $-\tau_f/2$ to $\tau_f/2$ for the "in-plane" state variables.

State Variable            Transfer                        Rendezvous

("in-plane")      τ = -τ_f/2    τ = τ_f/2         τ = -τ_f/2    τ = τ_f/2

x_1               1             Δx_1f + 1         1             Δx_1f + 1
x_2               x_20          x_20 + Δx_2f      x_20          x_20 + Δx_2f
x_3               x_30          x_30 + Δx_3f      x_30          x_30 + Δx_3f
x_4               x_40          FREE              x_40          x_40 + Δx_4f

(out-of-plane)    τ = 0         τ = τ_f           τ = 0         τ = τ_f

x_5               0             x_5f              0             x_5f
x_6               0             x_6f              0             x_6f

5. Integrated Equations of State (with initial conditions)


A*i 


- 4X, 


Y ) + sior 


3X4(  r2- 


Ax2  - 


- 4^ ( cost- cos ) 4 -^p[  5(r  + ^ ) - 3(sinr  cost  4 ) j 

4 Xjot  sin2r  — sin2-^- ) - 2X4£  4 ( sinr  + sin  -p  ) - 3 (t  cost  4-  p COS -p  ) 


Ax,  = 4XW(  sinT  4 sin 4 X2Q(sin2T  — sin2-^- ) 

4-  5 (t+  + 3 ( sinT  cost  4 )] 

- zxj  4(cost  - ^p-)  4 3(  T COST  4 -g  cos  U)] 


(A-106) 


(A-107) 


(A-108) 
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Ax4  s XioI  3 (r+  y )*  -8  [ I - c°s  (t  + y )]  } 

+X20((r+--^  ) [ 5 cost  + 6cos  - ysin(r+^f)-  -y  sinr  - 8 sin  y } 

+4so{(r+  -Jf  )[6sin|-  - 5Sin  r ] + cos(r  +^-)-  -y  cosr  +8  cos  ^-} 
+X4  { I6(r  + -3-)  - 6rf[  I - cos (r  + y[- )]  + + \ T(%)  ~ T T* 

+ 2X3^  cosr  - cosy]  - Zx^fsinr  + sin  y ] 

x#  = ( r - sinr  cosr  ) - y*  sin2r 

X#  S - -y  sin*T  + y ( r + sinr  cosr  ) 

6 . Transversality  Conditions  - Transfer 

X4  = 0 
^5  s tonr 


7.  Constants  of  Integration 


Transfer 


\o  s 


yl  (5rf  + 3sinrf ) - 4Ax„siny 
if  (5rf  + 3sinrf)  - 16  ( I - cosrf) 


2 Ax2f 

Xz0’  5rt  - 3sinrf 

2[*fAx,f-  2Ax,f  sin-^-] 

X»°*  rf  ( 5rf  + 3 sinr,)  - 16(1- CO«rf ) 


(A-109) 

(A-110) 
(A-lll) ' 

(A-112) 
(A -113 ) 

(A-114) 

(A-115) 

(A-116) 
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*s»  ( Tf+  sinrf  cost-)  4-  x6,  sin2rf 
9 2(rf2- sin2rf) 

Rendezvous 

( 5r,  4-  3sinrf)  - 4Ax3,sin-5j- 
X|°  S r,  ( 5r,+ 3sinr, ) - I6(  I - cost,) 


(A-117) 


(A-118) 


X2os 


Tt  ( 5rf-  3sinrf)  ( j|r,24-  I)  - 2 ( 8 sin^j--3r,  cos  |_ 

4-  Ax2,[  r’+  8 r,  - 3r,(  I - cost,  ) - 8 sinr, ] 

- [3t, cos^j-  - 8sin4jr]  [ 2Ax,f  sin  ^ 4-  Ax«,  + Ax^sin  ^ ] 

2 [T,  Ax3f  - 2Ax„  sin-£-  ] 


■5*rfAx,f(3rf  cos-g--8  sin  -g- ) 

(A-119) 


'SO 


rt  ( 5rf  + 3sinrf ) - 16(1-  cosrf) 


(A-120) 


^4  = ' 


rf(5rf  — 3sinrf)  (‘^rf2  + 1 ) “ 2(8 sin ~ — 3*f  cos  -g-  )2 

^^llTfCOS*^  4-  3sin*j-  ( I - costf ) - 22 sin  -j-J 

4-  ( 5rf-  3 sinr,)  [ ^psin  ^ 4-  x^in  y ] j 


- -f|-TfAx1((5Tf-  3sinr,) 


(A-121) 


*5  = 


2 { Xjj  (rf  4-  sinrfcosrf)  4-  xgf  sin2rf  } 2i  [rf  sinii,4-  sinrf  sin(&f4-rf)  J 


2 . 2 
r,  - sm  t, 


Z . 2 
rf  - sin  rf 


(A -122) 


X«  = 


2 { x9,  sin2 r,  4-  x9,  ( r,  - sinr,  cost,  ) } 2 i [ rf  cos  ii,  - sinrf  cos(&,4-rf)  j 


2 . 2 
Tf  - sin  Tf 


2 . 2 
rf  - sin  rf 


(A-123) 
8. Controls

$$n_0 A_S = 2\lambda_{10} - 3\lambda_4\tau + 2\lambda_{20}\sin\tau + 2\lambda_{30}\cos\tau \tag{A-124}$$

$$n_0 A_R = 2\lambda_4 - \lambda_{20}\cos\tau + \lambda_{30}\sin\tau \tag{A-125}$$

$$n_0 A_W = -\lambda_5\sin\tau + \lambda_6\cos\tau \tag{A-126}$$

9. Payoff

Transfer 


n3  r 2 
no  ro 


( 5rf  + 3sinrf)  - 4 Ax,*  Ax3fsin  -tj- 
rf  ( 5rf  + 3sinrf ) - 16  ( I — cosrf ) 


A 2 A 2 

Tt  Ax3f  Ax2f 

+ 


5rf-3sinrf 


NOTE: (1)


(A-127) 


rf+  |sinrt| 


Rendezvous 


n03r02 


AX  |f  Tf  2 

— ( 5rt  + 3s1nrf)  -4Axtf  Ax3fsin-^-  + rf  Ax3f 
rf  ( 5rf  + 3sinrf)  — I6(  I — cosrf) 

-g- (5rf  - 3sinrt)  { 2 Ax2tcos^  - 2Ax3fsin^  - Ax4t  + rfAxtf  - Ax^in^-  } 

2 


rf  ( 5rt  - 3sinrf  ) ( j|-  rf2  + | ) - 2(3rtcos  -y  - 8 sin^f-  ) 

Ax^l  3rf  cos  - 8 sin  ^ ) { 2Ax2f  cos  ^ - 2 Ax31  sin  -Ax4f  + -|-rf  Ax,f  -^x^ysin  ^ } 
T| ( 5rf  - 3sinrf)(  rt2+  I ) — 2 ( 3 rf  cos-^-  - 8sin  ) 

^*2f2  ( i|  rf2+  1 } 


(A-128) 


rf  ( 5rf  - 3 sinrf ) ( T,  2 + I ) - 2 ( 3rf  cos  - 8 sin  ) 


P 


- sin  rf  cos  ( 2iif  + rf ) 


(rfz-  sin2  rf ) 


NOTE: (1) The second term of this equation is incorrect in Ref. 1.
10. The optimal values for changes in the state variables $x_4$ and $\Omega$ are
predicted by the variational analysis in the case of orbit transfer where the
values $x_4$ and $\Omega$ are left open at the final time.


Ax* 


3 Tf  Ti 

-4-  Tj  Ax - 2 A x3f  sin  -75 — 4x3osin-2- 


(a-129) 


(A-130) 


11. Payoff Equations with an Intermediate Reference Orbit
Transfer 

Ax  2 

J — jp-  (5rf  + 3sinrt ) -4Ax|(  Ax3f  sin^f-  + rfAx3f2  Ax2t2 

nI3rX2  (5rf  + 3sinrf)  - l6(l-cosrf)  5rf-3sinrf 

.2  (A-131) 

I 

Tt  + 1 Sin  Tf  I 


Rendezvous 


Ax. 


7TT* 

ni  ri 


_ 2 

(5rf+ 3sinrf ) - 4Ax|f  Ax3f  sin-^-  + rfAxSf 
rf(  5rt+  3sinrf)  — i6(  I - cost, ) 


0 (5Tf_3sinrf)  {2  Ax2f  cos-jy-  - 2Ax3fsiny  - Ax4,  + -|-rf  ( xD  + xrt  - 2 ) - 4 Xjgsin  — } 

Tf  ( 5rf  - 3sinrf)  ( -j^-  rf  2 + I ) - 2 ( 3rf  cos  — - 8 sin  )* 

Ax2t(3rfCos^-8sin^  ) {2Ax2fcos-^-  -2Ax3fsin^--  A*4f  + x»+  *•»  " 2 >-4*30S'"  y} 

Tt  ( 5rf  - 3sinrf)(  -j^-rf  2 + I ) - 2(  3rf  cos  — - 8 sin  )2 


+ 


rfAx2f 


Tf  ( 5rf  - 3 sinrf ) ( rf  2 + | ) _ 


2 ( 3rf  cos  -Tjf- 


- 8 sin  ) 


+ i 


.2  r rf  — Sinrf  cos(  2flf+rf ) 


rf  2 - sin2rf 


(A-132) 
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12. 


Optimal  Transfer  Coordinates 


Ax4*  = !"rf(*io+  x if  ~ 2 ) -2Ax3fsin^-  - 4x30sin-^- 
4 Ax 


+ 5rf-3sinrf  ( J cos  ^ ( 5rf  - 3sinrf ) + 3rf  cos  - 8 sin  \ } 


(A-133) 


Q., 


(A -134) 
APPENDIX  III

SYNTHESIS  OF  THE  OPTIMAL  CONTROLS 


A.  Rotating  Coordinates 

1 . Control  Equations 


Ay  * 

dAy 

ay  y 

dA  dA  dA 

+ du  u + dv  v ' d*  * 

(A-135) 

dA_ 

dA,  dA.  dA. 

(A-136) 

A,  * 

X 

ay  y 

+ a„  “ + a,  v + — ' 

dA, 

dA, 

dr  Z 

(A-137) 

2.  Guidance  Coefficients  - Transfer 


dAy  12  T* 


dy 


( I - cost1  )(  29  - 27COST1) 


(A-138) 


dAy  24 

—fa  z ( I - cost')  ( II  sinr'  - 3r'  cost'  - 8r'  ) 


(A-139) 


dAy  12  , 

( 5t*  + 3r'  sinr'  cost'  - 8 sinZT'  ) 


(a-i4o) 


[ 70T'sinT*  -55r*  + I8r  sinT1  cost'-I-  3{  I — cost*)(  5 - 27cost*)]  (A-l4l) 

S -i-  [ 65 t’2  - 80 t'  SinT*  - 24 t' sinT1  cost'— ( I — cost')(25-  103  cost')]  (A-l42) 
du  V 1 


aA, 

dv 


24  O) 

( 8r  - II  sinr*  + 3r*co$r»)(  I - cost*  ) 

<*> 


(A-143) 


NOTE  : 


dA«  8 
dv  du 
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dAx  _ - 2 sin2r' 

dz  T,2-sin2r' 

(3 A,  _ — (2r'~  sin2r') 

dw  r'1  - tln*r' 


(a-144) 


(A-145) 


where 


<X>  = 480r*  - 75 r'3  — 240t'cost'  ( I + cost')  - 144  sinr'(  I - cost*)  - 21  3r'  sin2r'^ 


3 . Rendezvous 

Due to the length and complexity of the synthesized, in-plane, control
equations for rendezvous, the guidance coefficients are not written explicitly
here. Instead the basic equations are tabulated, and the coefficients calculated
from these equations are plotted in Figs. 20 through 22.


X 


ac0 

dx; 


r*  - 4 


dC, 

dx; 


COST'  — 4 


dC; 

dx: 


sinr' 


<*Ay  ( dc0 

dx;  ' \ dx, 


_dC 

d 


dC2  \ 
— sinr'  + -r — cost') 
X;  dx,  / 


(A-147) 


(A-148) 


A*  « 


(A -149) 


X 

K 

$\4 

+» 

t 

$« 

$14 

y 

$2, 

$22 

$24 

- 

$20 

y 

$22 

$24 

u 

K 

$32 

$.4 

Cl  - 

$30 

u 

$32 

$34 

V 

$« 

$42 

$44 

$40 

V 

$42 

$44 

(A-150)  0 (A-151) 
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where 


^0 

<*>„ 

^12- 

X 

4>to 

4*22 

y 

K 

**• 

K 

u 

4>< K> 

*4, 

4*42 

V 

(A-153) 


* 


4> 

Mo 

<f> 

M2 

*4 

4*20 

^21 

^22 

4*24 

4*V3 

4b 

^32 

4>* 4 

4*40 

4>4, 

^2 

^44 

(A-154) 

and 


^K>  S 

-L  T'»  _ 8t-  + 8sinr- 

II 

8 

•6- 

8(1-  cost')  - T,f 

II 

8(1-  cost*)  - 5r’sinT' 

^31  s 

5r'  cost'  - 3 sinr' 

4,2  * 

5r*cosr'  - II  sinr*  + 6t’ 

4,2  = 

Sr’sinr'  - 6(  1 - cos r1) 

' 

*4  S 

6(1-  cost*)  - r* 

4>  34* 

-|“r'  - 6 sinr' 

*^0  = 

4(1-  cost’)  - yr4 

4*40  = 

3r'  - 4 sinr' 

(A-155) 

4b  = 

-|-  [ T'cosr*  - sinr'] 

4«  - 

■|r  r*sinr' 

422  * 

~Y  r^inT4  -4(1-  cost4) 

4> 42  Z 

sinr'  — r'cosr1 

<f>  5 

*24 

3 ( r-  sinr*) 

4* 44 

-3(1-  COST') 
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dAx  - 2 sin2r* 

dz  t*8  — sin2r' 

aAz  _ -(  2r'-  sin  2r') 

dw  r'2  — sin2r' 


B . Lagrange  Variables 

1.  Control  Equations 


A»  = 


AS  s 


A 111  — 


aAR 

^AR  a 

aAR 

aA„ 

aAR 

T&S** 

aAx,A*5 

+ aAx4A*4 

■*"  <),  *30 

°*30 

dAS  A 

dk. 

aAs  A 

a A® 

3a7ia‘' 

+ 7E7*A,*  + 

TET, 

^Aw  A 
aAx9 

aAw 

a«« 

2.  Guidance  Coefficients  - Transfer 


dAs 


— 4 sinr*  sin 


d An,  r'(  5r*  + 3sinr')  - 16 ( I - cost') 
a Ar  2 cost' 


aAx2  5r'  - 3 sinr' 


aA0 


2r'sinT' 


dAx,  rt5r*+  3sinr‘)  - 16(1  - cost*) 


dA.  8 cosr'sin  - 4-(5r'  + 3$lnT*) 

— 8 . £ £ 


dAx 


r*(  5r*  +■  3sinr' ) - 16(1 -cost') 


aAs  4 sinr1 

dA*|  * 5r'-3*lnr' 


(A-156) 

(A-157) 

(A-158) 

(A-159) 

(A-160) 

(A-161) 

(A-162) 

(A-163) 

(a-164) 

(a -165 ) 
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dAw  cost'  sin2r* 

8Ax9  t'2-  sin2r' 

<3Aw  - -g-  cosr‘(  2r'-  sin2r*) 

dAx#  r1’2  - sin2r* 

dAs  4(2 sin  y - T*cosr') 

<iAxs  t*(5t'4*  3sinr')  - 16  ( I -cosrl 


3 . Guidance  Coefficients  - Rendezvous 

d AR  4 sinr' sin  r[  5r*  - 3 sinr*  + 2 cosr*(  3r*cos  ^ - 8 sin  •£  )] 

8Ax,  * Q B 


<?Ar  2t'cost'(  ^ r*  + I ) + cos  y ( 5r'-  3sinr*)  + 2(  3r'cos-^-  - 8sin~  )( I + cost1 
dA*t  9 


dAg_ 

dAxj 


- 2r>sinr' 

■+ 

0 


m 


t 


8AR  -g-  [5t'~  3sinr'  + 2cost'{  3t'cos-^-  - 8sin-^-  )] 
dAx4 ' i 


dAR  2 sin  Y [ 5t'-  3sinr‘  + 2costX  3t'cos  - 8 sin^  )] 
d*»  0 ' 

«?AS  (5r‘  + 3 sinr'- 16 sin y cost1) 

dAx,  0 

•j^r'[  3t'(5t'-  3sinr')  + Ssinr^  3t'cos-^  -8siny)] 

B 


(a-166) 

(A-167) 

(A-168) 

(A-169) 

cos-§') 

(A-170) 

(A-171) 

(A-172) 

(A-173) 

(A-174) 
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<JAS  4r'sinTl  r'2  4- 1 ) + y r'cos  ^ ( 5r*  - 3sinr) 


dAxz  B 


( 3 r'cos  - 8 sin  )(  3r'4-  4sinrtos  -|r  ) 


B 


dAs  4(  r'cosr'  - 2 sin  ) 


dAx 


* T sin"2  [ 5r'  — 3sinr')  4-  SsintH  3 r'cos  £ 

— — 


8sin^- )] 


3AS 

<?Ax4 


^-[  3r‘(5r'-3sinr')  + 8sinr'(  3r'cos-|-  - 8sin-jj  )] 


B 


dAe 


sin' 


£ [ 3sinr')  4-  8sinr’(  3r'cos-^  - 8sin  ^ )] 


dx 


SO 


B 


dAw  2sinr'(r' 4-  sin2r‘) 
dAx9  ’ r*-sin2r* 


dAw  2 [sinr*  4-cosr'(r'  — sin2r') 

dAx#  r^-sinV 


where 


Q = !6(l-cosr')  - t\  5r'  4-  3sinr') 


B * r'(  5r' - 3sinr')(  r^  4-  I ) - 2(  8 sin - 3r'cos-g- )2 


(A-175) 

(A-176) 

(A-177) 

(A-178) 

(A-179) 

(A-l80) 

(A-l8l) 

(A-182) 
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FIG.  4 


CIRCUMFERENTIAL  ACCELERATION 

CIRCLE  - TO  - CIRCLE  TRANSFER 


(abscissa: τ/τ_f)
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RADIAL  ACCELERATION 


FIG. 6 
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CIRCUMFERENTIAL  ACCELERATION 

CIRCLE -TO -CIRCLE  RENDEZVOUS 


FIG.  7 




FIG.  8 


NORMAL  ACCELERATION 

CIRCLE  -TO-  CIRCLE  RENDEZVOUS 

τ_f = 60°

OPTIMUM TRANSFERS: 150° 330°
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OPTIMUM  EARTH -VENUS  TRANSFER 

UNINCLINED  CIRCULAR  TERMINAL  ORBITS 
AN APPROXIMATION TO LINEAR BOUNDED
PHASE COORDINATE CONTROL PROBLEMS*

E. B. Lee


1. Introduction

In many control problems, restraints may occur on the magnitudes of both
the control variables and various system variables. Certain results [1,2,7]
are available for the determination of optimal controllers for some classes
of linear and nonlinear systems involving such restraints. These results take
the form of necessary or sufficient conditions for optimal control, but not
both, and are therefore only a partial solution to even the theoretical
problem, leaving much to be desired in the way of a practical solution. To
use the necessary or sufficient conditions for synthesizing an optimal
controller it is necessary to solve a two-point boundary value problem in
terms of a number of free parameters and multipliers, where the number of
parameters is not even known, and to satisfy certain jump conditions [2,7].
A backing out procedure [9] is also available if one is interested in
flooding the domain of controllability with responses and then keeping track
of (storing) the corresponding control magnitude for each such point.

We here offer a procedure which has several advantages over the above
schemes, but is only an approximate solution. Its main advantage is that no
discontinuities will be encountered in the adjoint solution which determines
the optimum controller, and therefore the resulting two-point boundary value
problem may be more readily solved. The results provide both necessary and
sufficient conditions, as well as existence, for the approximate problem.


*Prepared under contract NASw-986 for the NASA.

The analysis is limited to linear control processes as described by the
differential system

$$\mathcal{L})\qquad \dot x = A(t)x + B(t)u(t).$$

The coefficient matrices $A(t)$ and $B(t)$ are composed of known continuous
functions on the time interval $[t_0, t_1]$. The controller $u(t)$ is to be
chosen from a set $\Omega$: $|u^j| \le 1$; $j = 1, 2, \ldots, m$, so as to steer
the response, $x_u(t)$, of $\mathcal{L})$ from an initial point $x_0$ at time
$t_0$ to a prescribed compact target set $G \subset R^n$, and it is required
that $x_u(t)$ remain within a given constraint set, $\Lambda$, during its entire
response. Here $R^n$ is the n-dimensional real number space.

The problem of time optimal control, as considered in the next section, is
to find a controller $u(t)$ which steers $x_u(t)$ from $x_0$ to
$G \subset \Lambda$ in minimum time, that is, minimizes $C(u) = t_1 - t_0$ with
$x_u(t_1) \in G$ and $x_u(t) \in \Lambda$, $t_0 \le t \le t_1$. Later, in
section 4, we discuss other optimum control cost functionals.

There are certain difficulties involved when one directly solves for this
optimum controller. We shall therefore be content with solving the following
apparently simpler problem: Find that controller $u(t)$ with graph in $\Omega$
which steers $x_u(t)$ from $x_0$ at $t_0$ to $G$ at $t_1$ with
$x^0_u(t_1) \le \beta$ and $t_1 - t_0$ a minimum. $x^0_u(t)$ is defined below.

It is assumed that $\Lambda$ is a closed convex set (for convenience we could
even let $\Lambda = \{x \mid x'Hx \le c\}$, where $H$ is a positive semi-definite
matrix and $c = \text{constant} > 0$). Let $F(x)$ be a convex continuously
differentiable function which is such that

$$F(x) > 0 \ \text{ if } x \notin \Lambda, \qquad F(x) = 0 \ \text{ if } x \in \Lambda.$$


Then define†

$$x^0_u(t_1) = \int_{t_0}^{t_1} F(x_u(t))\,dt.$$

$x^0_u(t_1)$ essentially measures the excursions of the response $x_u(t)$ to a
controller $u(t)$ outside of the region $\Lambda$ during the time interval
$[t_0, t_1]$. By keeping $x^0_u(t_1)$ small the response $x_u(t)$ is restricted
to stay close to or within $\Lambda$. The above minimum time optimal control
problem is approximately solved by finding a controller which steers
$\hat x_u(t) = (x^0_u(t), x_u(t))$ from $(0, x_0)$ to
$\hat G = \{x^0, x \mid x \in G,\ 0 \le x^0 \le \beta\}$ in the minimum time
interval $t_1 - t_0$ if $\beta > 0$ is sufficiently small.

In the next section we give necessary and sufficient conditions for this
approximation problem using the time optimal criterion. Section 3 contains an
example and section 4 is a discussion of the approximation problem for other
cost functionals.

2. The necessary and sufficient conditions for the approximate
linear time optimal problems

We augment the system $\mathcal{L}$ by considering the equation system

$$\hat{\mathcal{L}})\qquad \dot x^0 = F(x), \qquad \dot x = A(t)x + B(t)u(t)$$

obtained from $\mathcal{L})$ by adding the equation for $x^0$ with
$x^0(t_0) = 0$. Here $A(t)$, $B(t)$ are bounded and continuous on $[t_0, t_1]$
and $F(x)$ is a convex function with $F(x) = 0$ for $x \in \Lambda$.
$\frac{\partial F}{\partial x}(x)$ is assumed to exist and be continuous
everywhere.

The set of attainability $\hat K(t_1) \subset R^{n+1}$ is the collection of
end points $\hat x_u(t_1)$ of responses $\hat x_u(t) = (x^0_u(t), x_u(t))$ of
$\hat{\mathcal{L}}$ which initiate at $(0, x_0)$ at time $t_0$ corresponding to
all (Lebesgue) measurable controllers $u(t)$ which are such that
$|u^j(t)| \le 1$ on $[t_0, t_1]$, for $j = 1, 2, \ldots, m$. (Such controllers
are referred to as admissible controllers.)

In the following theorems we establish various properties for $\hat K(t_1)$
and $\partial\hat K(t_1)$ as required in synthesizing optimal controllers.


†There is, of course, some question as to whether such a function $F(x)$
exists for an arbitrary convex set $\Lambda$ contained in $R^n$. We now cite an
example which shows that there are such functions in a number of interesting
cases. Suppose $\Lambda = \{x^1, x^2, \ldots, x^n \mid |x^2| \le 1\}$, where
superscripts index the coordinates. Then pick

$$F(x) = \begin{cases} \tfrac{1}{2}(x^2 - 1)^2 & \text{if } x^2 > 1 \\ 0 & \text{if } |x^2| \le 1 \\ \tfrac{1}{2}(x^2 + 1)^2 & \text{if } x^2 < -1 \end{cases}$$

Thus if only one coordinate (or a linear combination) is restricted the
problem is easily handled as in the example, where $F(x)$ is continuous and has
continuous partial derivatives. Other $\Lambda$'s can be approximately handled
as in the example.
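
The example penalty function cited in the footnote, for the strip where the second coordinate is bounded by one in magnitude, can be written out directly. The sketch below is only an illustration with hypothetical function names; it takes the restricted coordinate as a single number and returns the penalty and its derivative, which makes it easy to see that both are continuous at the edges of the strip and that the function is convex.

```python
def penalty(x2):
    """F for the strip |x2| <= 1: zero inside, quadratic outside."""
    if x2 > 1.0:
        return 0.5 * (x2 - 1.0) ** 2
    if x2 < -1.0:
        return 0.5 * (x2 + 1.0) ** 2
    return 0.0


def penalty_gradient(x2):
    """Derivative of F with respect to the restricted coordinate."""
    if x2 > 1.0:
        return x2 - 1.0
    if x2 < -1.0:
        return x2 + 1.0
    return 0.0


# Midpoint convexity check for one pair of points outside the strip.
left, right = -3.0, 2.0
midpoint_value = penalty(0.5 * (left + right))
chord_value = 0.5 * penalty(left) + 0.5 * penalty(right)
```

The derivative vanishes at both edges of the strip, so the adjoint equation, which contains this derivative, has a continuous right-hand side along any continuous response.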


Theorem 1. Consider the above system $\hat{\mathcal{L}})$ with initial point
$\hat x_0$, restraint set $\Omega$, and set of attainability $\hat K(t_1)$.
Then $\hat K(t_1)$ is a nonempty compact subset of $R^{n+1}$ in variables
$(x^0, x)$ with convex lower surface (as defined below) for each
$t_0 \le t_1 < \infty$.

Proof. $\hat K(t_1)$ is nonempty since any measurable controller
$u(t) \subset \Omega$ gives rise to an end point
$\hat x_u(t_1) \in \hat K(t_1)$. $\hat K(t_1)$ is compact because the system
$\hat{\mathcal{L}})$ satisfies the hypothesis of the existence theorems of
references 6 and 8.

The lower surface of $\hat K(t)$ is where exterior normal (n+1)-vectors
$\hat\eta$ to $\hat K(t)$ at points of $\partial\hat K(t)$ have their first
component $\eta_0 \le 0$. We now show that if $\hat x_1$ and $\hat x_2$ are
points of $\hat K(t_1)$ then the point
$\hat y = \lambda\hat x_1 + (1-\lambda)\hat x_2 = (y^0, y)$,
$0 \le \lambda \le 1$, is such that
$$y = x_{\bar u}(t_1)$$

and

$$y^0 \ge x^0_{\bar u}(t_1),$$

where $\bar u(t) = \lambda u_1(t) + (1-\lambda)u_2(t)$ and $u_1(t)$ and $u_2(t)$
are such that $\hat x_{u_1}(t_1) = \hat x_1$ and $\hat x_{u_2}(t_1) = \hat x_2$.
The convexity of the lower surface of $\hat K(t_1)$ then follows because in
order for it to be nonconvex it is necessary that there exist two points
$\hat x_1$, $\hat x_2$ on this lower boundary, with the property that the point
$\lambda\hat x_1 + (1-\lambda)\hat x_2$ is below the set $\hat K(t_1)$ for some
$0 < \lambda < 1$, which will then be impossible.

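With $\bar u(t) = \lambda u_1(t) + (1-\lambda)u_2(t)$ we find that

$$\begin{aligned}
x_{\bar u}(t_1) &= \Phi(t_1)x_0 + \int_{t_0}^{t_1}\Phi(t_1)\Phi^{-1}(s)B(s)\bar u(s)\,ds \\
&= \lambda\Big[\Phi(t_1)x_0 + \int_{t_0}^{t_1}\Phi(t_1)\Phi^{-1}(s)B(s)u_1(s)\,ds\Big] \\
&\quad + (1-\lambda)\Big[\Phi(t_1)x_0 + \int_{t_0}^{t_1}\Phi(t_1)\Phi^{-1}(s)B(s)u_2(s)\,ds\Big] \\
&= \lambda x_{u_1}(t_1) + (1-\lambda)x_{u_2}(t_1) = \lambda x_1 + (1-\lambda)x_2 = y
\end{aligned}$$

where $\Phi(t)$ is the fundamental solution matrix of $\mathcal{L}$ with
$\Phi(t_0) = I$. We also calculate

$$x^0_{\bar u}(t_1) = \int_{t_0}^{t_1} F(x_{\bar u}(t))\,dt$$

and $\lambda x^0_{u_1}(t_1) + (1-\lambda)x^0_{u_2}(t_1)$ for comparison. Since
$F(x)$ is a convex function of $x$ it follows that for $0 \le \lambda \le 1$,

$$F(x_{\bar u}(t)) = F(\lambda x_{u_1}(t) + (1-\lambda)x_{u_2}(t)) \le \lambda F(x_{u_1}(t)) + (1-\lambda)F(x_{u_2}(t))$$

and so

$$\begin{aligned}
x^0_{\bar u}(t_1) &= \int_{t_0}^{t_1} F(x_{\bar u}(t))\,dt = \int_{t_0}^{t_1} F(\lambda x_{u_1}(t) + (1-\lambda)x_{u_2}(t))\,dt \\
&\le \lambda\int_{t_0}^{t_1} F(x_{u_1}(t))\,dt + \int_{t_0}^{t_1}(1-\lambda)F(x_{u_2}(t))\,dt = y^0.
\end{aligned}$$

Q.E.D.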
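
The two facts used in this proof, linearity of the response in the controller and convexity of the penalty integral, can be seen on a small numerical experiment. The sketch below is an illustration only: the scalar plant $\dot x = -x + u$, the bound 0.5 on $|x|$, the forward-Euler integration, and every name in the code are assumptions chosen for the example and do not come from the paper.

```python
def penalty(x, bound=0.5):
    """Convex penalty: zero for |x| <= bound, quadratic outside."""
    excess = abs(x) - bound
    return 0.5 * excess * excess if excess > 0.0 else 0.0


def simulate(control, a=-1.0, b=1.0, x_start=0.0, t1=2.0, steps=2000):
    """Forward-Euler response of x' = a x + b u with x0' = F(x).

    Returns (x0, x) at the final time t1.
    """
    dt = t1 / steps
    x, x0 = x_start, 0.0
    for k in range(steps):
        x0 += penalty(x) * dt
        x += (a * x + b * control(k * dt)) * dt
    return x0, x


def compare(lam, u1, u2):
    """End points for the mixed controller versus the mixed end points."""
    def u_bar(t):
        return lam * u1(t) + (1.0 - lam) * u2(t)

    x0_1, x_1 = simulate(u1)
    x0_2, x_2 = simulate(u2)
    x0_bar, x_bar = simulate(u_bar)
    y = lam * x_1 + (1.0 - lam) * x_2
    y0 = lam * x0_1 + (1.0 - lam) * x0_2
    return x_bar, y, x0_bar, y0


x_bar, y, x0_bar, y0 = compare(0.3, lambda t: 1.0, lambda t: -1.0)
```

Here `x_bar` agrees with `y` to rounding error, while `x0_bar` does not exceed `y0`, so the point reached by the mixed controller lies on or below the segment joining the two end points in the $x^0$ direction.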

We will now consider those controllers $u(t)$ on $[t_0, t_1]$ which steer
$\hat x_u(t)$ from $\hat x_0$ at $t_0$ to points $\hat x_1$ contained in the
lower boundary of $\hat K(t_1)$ (written $\partial\hat K^-(t_1)$). Such
controllers will be called extremal and they will play a significant part in
the selection of optimal controllers.

Let $u(t) \in \Omega$ on $t_0 \le t \le t_1$ be an admissible controller for
the convex control process

$$\hat{\mathcal{L}})\qquad \dot x^0 = F(x), \qquad \dot x = A(t)x + B(t)u(t)$$

with initial point $\hat x_0 = (0, x_0)$ at $t_0$. If the corresponding
response $\hat x_u(t)$ has an end point
$\hat x_u(t_1) \in \partial\hat K^-(t_1)$, then $u(t)$ is called an extremal
control and $\hat x_u(t)$ an extremal response on $[t_0, t_1]$.
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so  that 


n(t)B(t)u(t)  - Max  {n(t)B(t)u}  almost  always  on  [t^tj  . 

U€fi  — k 

/Proof:  Assume  u(t)  on  [t0,t!]  Is  extremal  and  so  steers 

x(t)  from  (0,xQ)  at  tQ  to  ^“(t^  at  t±.  Choose  T)^)  - 

m (tj  j'n(t^) ) to  he  a nonzero  vector  normal  to  n directed  into  the  t 

halfspace  defined  by  tt  which  does  not  meet  K^).  Note 

ti  ^ o.  Then  let  n(t)  with  T\(t, ) as  above  be  the  response  of  the 

adjoint  equation  corresponding  to  the  controller  u(t). 

The  controller  + u(t)  * sgn{^(t )B(t)}  defined  for 
t e [t  ,t,  ] is  admissible  and 

O 1 

Tj(t)B(t)u(t)  - Max  Cn(t)B(t)u) 
ueO 

on  Ct  ,^1. 

Let  be  an  interval  of  total  length  e > 0 contained 
in  *5  - [t0,t1]  whereon 

6 + n(t)B(t)u(t)  < Max  {q(t)B(t)u)  for  some  6 > 0.  * 

ueO 

For  given  6 > 0 consider  the  modified  controller 
u£(t)  - u(t)  on  <9  - t£ 

■ u(t)  on  t€, 

+ sgn  {}  - -1  if  {}  < 0 

- 0 if  {}  » 0 

- +1  if  {}  > 0 
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The  adjoint  response  “n(t)  = , r| (t ) ) corresponding 

to  a controller  u(t)  is  a row  n+1  vector  satisfying  the 
differential  system 

^ * _T1  A(t)  - t)q  (xu(t)) 

tj  = constant  0. 
o 

where  xu(t)  is  the  response  of  x)  corresponding  to  the  controller 

u(t).  Define  u(t)  on  Ct_ , tn 1 to  be  a maximal  controller  in 

case  there  exists  a nonvanishing  adjoint  response  rj(t), 

t»  S.  30  that  Ti(t)B(t)u(t)  = Max  {Tj(t)B(t)u}  a.e.  on[tQ,ti]. 

° uefi 

In the following theorem 2 it is shown that extremal
and maximal controllers are the same.

Theorem 2 Consider the convex control process†
$$\mathcal{L})\qquad \dot{x}^0 = F(x), \qquad
\dot{x} = A(t)x + B(t)u(t)$$
with initial point $\hat{x}_0 = (0,x_0)$ at time $t_0$. An admissible
controller $u(t) \subset \Omega$ on $[t_0,t_1]$ is extremal for
$\mathcal{L}$ if and only if it is a maximal controller, that is, if
and only if there is a nonvanishing adjoint response $\hat{\eta}(t)$ of
$$\dot{\eta} = -\eta A(t) - \eta_0 \frac{\partial F}{\partial x}(x_u(t)),
\qquad \eta_0 = \text{constant} \le 0$$


† The necessary portion of this theorem follows from L. S.
Pontryagin's Maximum Principle (7). For completeness the
simple arguments to establish the necessary part are
presented.
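and calculate
$$\frac{d(\hat{\eta}(t)\hat{x}_\epsilon)}{dt}
= \dot{\hat{\eta}}\hat{x}_\epsilon + \hat{\eta}\dot{\hat{x}}_\epsilon$$
and
$$\frac{d(\hat{\eta}\hat{x})}{dt}
= \dot{\hat{\eta}}\hat{x} + \hat{\eta}\dot{\hat{x}},$$
where $\hat{x}_\epsilon$ refers to a response of $\mathcal{L}$
corresponding to the modified controller $u_\epsilon(t)$.
Integration from $t_0$ to $t_1$ yields
$$\hat{\eta}(t_1)\hat{x}_\epsilon(t_1) - \hat{\eta}(t_0)\hat{x}_\epsilon(t_0)
= \int_{t_0}^{t_1}\Big\{\Big[-\eta A(t)
+ \frac{\partial F}{\partial x}(x(t))\Big]x_\epsilon(t)
+ \eta(t)[A(t)x_\epsilon(t) + B(t)u_\epsilon(t)]
- F(x_\epsilon(t))\Big\}dt$$
and
$$\hat{\eta}(t_1)\hat{x}(t_1) - \hat{\eta}(t_0)\hat{x}(t_0)
= \int_{t_0}^{t_1}\Big\{\Big[-\eta A(t)
+ \frac{\partial F}{\partial x}(x(t))\Big]x(t)
+ \eta(t)[A(t)x(t) + B(t)u(t)] - F(x(t))\Big\}dt$$
for $\eta_0 = -1$.

Combining terms and using the assumed continuity for $F$ and
$\partial F/\partial x$ we easily find that
$$\hat{\eta}(t_1)\hat{x}_\epsilon(t_1) - \hat{\eta}(t_1)\hat{x}(t_1)
\ge \delta\epsilon + o(\epsilon)$$
for $\epsilon$ sufficiently small, where $o(\epsilon)$ corresponds to
terms of higher than first order in $\epsilon$, and therefore for
$\epsilon$ sufficiently small
$$\hat{\eta}(t_1)\hat{x}_\epsilon(t_1) - \hat{\eta}(t_1)\hat{x}(t_1) > 0,$$
contradicting the construction of $\hat{\eta}(t_1)$ as the outward
normal to $\hat{K}(t_1)$ at $\hat{x}(t_1)$.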
Hence there exists no such interval $\tau_\epsilon$, so
$$\eta(t)B(t)u(t) = \max_{u\in\Omega}\eta(t)B(t)u$$
almost everywhere on $\mathcal{J}$.

Conversely, assume that $u(t)$ and corresponding response
$\hat{\eta}(t) \ne 0$ are such that
$$\eta(t)B(t)u(t) = \max_{u\in\Omega}\eta(t)B(t)u$$
a.e. on $\mathcal{J}$ with $\eta_0 \le 0$. Let $\bar{u}(t)$ be any
controller in $\Omega$ with corresponding response
$\hat{x}_{\bar{u}}(t)$. If we calculate
$$\frac{d(\hat{\eta}\hat{x}_u)}{dt} \quad\text{and}\quad
\frac{d(\hat{\eta}\hat{x}_{\bar{u}})}{dt}$$
as above, and then integrate from $t_0$ to $t_1$ using the assumed
convexity of $F(x)$ we find that
$$\hat{\eta}(t_1)\hat{x}_u(t_1) \ge \hat{\eta}(t_1)\hat{x}_{\bar{u}}(t_1)
= \hat{\eta}(t_1)w$$
where $w$ is any point of $\hat{K}(t_1)$. Since
$|\hat{\eta}(t_1)| \ne 0$ and $\eta_0 \le 0$, the above inequality
implies that $\hat{x}_u(t_1)$ is contained in the lower boundary of the
compact set $\hat{K}(t_1)$ with convex lower boundary and hence $u(t)$
is extremal.

QED.

Theorem 2 indicates that to stay at a lower boundary
point we must continuously steer maximally in the direction
of the vector $\eta(t)$. This remark is summarized as a corollary.

Corollary 2.1 Let $u(t)$ on $[t_0,t_1]$ be an extremal controller
for $\mathcal{L}$, with corresponding response $\hat{x}_u(t)$ and
adjoint response $\hat{\eta}(t)$ so that
$$\eta(t)B(t)u(t) = \max_{u\in\Omega}\eta(t)B(t)u$$
a.e. on $[t_0,t_1]$. Then on each subinterval
$[t_0,\tau] \subset [t_0,t_1]$, $u(t)$ is also an extremal controller
with $\hat{x}_u(\tau) \in \partial\hat{K}(\tau)$. Moreover
$\hat{\eta}(t)$ is an exterior normal to $\hat{K}(t)$ at $\hat{x}(t)$.

Proof Replace $t_1$ by $t$ in the proof of theorem 2 to obtain
that
$$\hat{\eta}(t)\hat{x}_u(t) \ge \hat{\eta}(t)\hat{x}_{\bar{u}}(t)
= \hat{\eta}(t)w(t)$$
for all $w(t)$ in $\hat{K}(t)$. From this inequality the conclusion
of the corollary can be drawn.

We next show that the set of attainability $\hat{K}(t_1)$
depends continuously on the parameter $t_1$.

Define the distance between a point $p$ and a compact
set $G_1 \subset R^n$ to be
$$d(p,G_1) = \min_{g\in G_1}|p-g|$$
and define the distance between two compact sets $G_1$ and
$G_2 \subset R^n$ to be
$$d(G_1,G_2) = \max\Big\{\max_{p_1\in G_1} d(p_1,G_2),\;
\max_{p_2\in G_2} d(p_2,G_1)\Big\}.$$
Here
$$|p| = \sum_{i=1}^{n}|p^i|.$$

The set $\hat{K}(t_2) \subset R^{n+1}$ varies continuously with $t_2$
if given an $\epsilon > 0$ there exists a $\delta > 0$ so that for
$|t_2 - t_1| < \delta$
$$d(\hat{K}(t_1), \hat{K}(t_2)) < \epsilon.$$
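
The set distance just defined can be sketched numerically for finite
point sets. This is only an illustration: the sets of attainability
themselves are not finite, and the function names below are hypothetical.

```python
def norm1(p):
    """The norm |p| = sum of |p^i| used in the text."""
    return sum(abs(c) for c in p)


def dist_point_set(p, G):
    """d(p, G): least distance from the point p to the finite set G."""
    return min(norm1([a - b for a, b in zip(p, g)]) for g in G)


def dist_sets(G1, G2):
    """d(G1, G2): the larger of the two one-sided distances."""
    return max(max(dist_point_set(p, G2) for p in G1),
               max(dist_point_set(p, G1) for p in G2))
```

Taking the larger of the two one-sided distances is what makes
$d(G_1,G_2)$ symmetric, so that "each point of one set is close to some
point of the other, and conversely" is a single inequality.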
Lemma 1 Consider the system $\mathcal{L}$ as above with attainable set
$\hat{K}(t_1) \subset R^{n+1}$. Then $\hat{K}(t_1)$ varies continuously
with $t_1 < \infty$.

Proof We need only show that each point $\hat{x}(t_1)$ of
$\hat{K}(t_1)$ is close to some point $\hat{x}(t_2)$ of $\hat{K}(t_2)$
and conversely. That is, we need show that given $\epsilon > 0$ there
exists a $\delta > 0$ so that when $|t_1 - t_2| < \delta$ there exists
$\hat{x}(t_1) \in \hat{K}(t_1)$ such that
$|\hat{x}(t_1) - \hat{x}(t_2)| < \epsilon$ for each
$\hat{x}(t_2) \in \hat{K}(t_2)$ and conversely.

Let $u_1(t)$ be an admissible controller on $[t_0,t_1+1]$ and
$\hat{x}_1(t)$ the corresponding response. For
$t_1 \le t_2 \le t_1+1$ calculate
$$x_1^0(t_2) - x_1^0(t_1) = \int_{t_0}^{t_2}F(x_1(t))dt
- \int_{t_0}^{t_1}F(x_1(t))dt$$
and
$$x_1(t_2) - x_1(t_1)
= \Phi(t_2)\int_{t_0}^{t_2}\Phi(s)^{-1}B(s)u_1(s)ds
- \Phi(t_2)\int_{t_0}^{t_1}\Phi(s)^{-1}B(s)u_1(s)ds
+ [\Phi(t_2) - \Phi(t_1)]\Big[\int_{t_0}^{t_1}\Phi(s)^{-1}B(s)u_1(s)ds\Big].$$
So
$$x_1^0(t_2) - x_1^0(t_1) = \int_{t_1}^{t_2}F(x_1(t))dt$$
and
$$x_1(t_2) - x_1(t_1)
= \Phi(t_2)\int_{t_1}^{t_2}\Phi(s)^{-1}B(s)u_1(s)ds
+ [\Phi(t_2) - \Phi(t_1)]\Big[\int_{t_0}^{t_1}\Phi(s)^{-1}B(s)u_1(s)ds\Big].$$

Since $A(t)$ is bounded and continuous on $[t_0,t_1+1]$ so is
$\Phi(t)$ and therefore there exists a constant $C_1$ so that
$$|\Phi(t)| < C_1$$
and
$$|\Phi(t)^{-1}| < C_1 \quad\text{on } [t_0,t_1+1].$$

Also since $B(s)$ has bounded continuous elements $b_j^i(t)$ and
$u_1(t)$ is bounded and measurable there exists the constant
$C_2$ so that
$$\Big|\int_{t_0}^{t_1}\Phi(s)^{-1}B(s)u_1(s)ds\Big| < C_2.$$
Integration is a continuous operation and $\Phi(t)$ is continuous;
therefore, given an $\epsilon > 0$ there exists a $\delta > 0$ so that
$$\Big|\int_{t_1}^{t_2}F(x_1(t))dt\Big| < \frac{\epsilon}{3},\qquad
\Big|\int_{t_1}^{t_2}\Phi(s)^{-1}B(s)u_1(s)ds\Big| < \frac{\epsilon}{3C_1},
\qquad |\Phi(t_2) - \Phi(t_1)| < \frac{\epsilon}{3C_2}$$
for $|t_2 - t_1| < \delta < 1$.

Hence
$$|\hat{x}_1(t_2) - \hat{x}_1(t_1)|
< \frac{\epsilon}{3} + C_1\frac{\epsilon}{3C_1}
+ \frac{\epsilon}{3C_2}C_2 = \epsilon$$
for $|t_2 - t_1| < \delta < 1$.

The other way we consider $u_1(t) = u(t)$ on $[t_0,t_1]$ where
$u(t)$ steers to $\hat{x}(t_1)$ and extend it to $[t_0,t_1+1]$ by
letting $u_1(t) = u(t_1)$ for $t \in [t_1,t_1+1]$. The above
calculation is then repeated to find
$|\hat{x}(t_2) - \hat{x}(t_1)| < \epsilon$ for
$|t_2 - t_1| < \delta < 1$ and so $\hat{K}(t_1)$ varies continuously
with $t_1$.

Theorem 3 Consider the system $\mathcal{L}$ as above with initial data
$\hat{x}_0 = (0,x_0)$, compact restraint set $\Omega$, and set of
attainability $\hat{K}(t_1)$. Let the target set
$\hat{G} = \{x^0,x \mid 0 \le x^0 \le \beta,\; x \in G\}$ where
$\beta > 0$ is a constant and $G$ is a compact set of $R^n$. Suppose
$\hat{G}$ meets the interior of $\hat{K}(t_1)$, then there is a
$\delta > 0$ such that $\hat{G}$ meets $\hat{K}(t)$ for
$|t - t_1| < \delta$.

Proof Since $\hat{G}$ meets the interior of $\hat{K}(t_1)$, there is a
point $\hat{p} \in (\hat{G} \cap \operatorname{Int}\hat{K}(t_1))$ and a
ball neighborhood $N(\hat{p})$ of radius $r > 0$ contained in
$\hat{K}(t_1)$. Consider the hyperplane $x^0 = p^0 - r/2$ of $R^{n+1}$
and in this plane pick $n+1$ independent points
$\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_n, \hat{x}_{n+1}$ of the boundary
of the ball $N(\hat{p})$, all equally spaced. Let
$\hat{x}_1(t), \hat{x}_2(t), \ldots, \hat{x}_n(t), \hat{x}_{n+1}(t)$ be
responses of $\mathcal{L}$ with initial data $\hat{x}_0 = (0,x_0)$ and
corresponding to controllers $u_1(t), u_2(t), \ldots, u_{n+1}(t)$,
$t_0 \le t \le t_1+1$, which are such that
$\hat{x}_1(t_1) = \hat{x}_1, \ldots, \hat{x}_{n+1}(t_1) = \hat{x}_{n+1}$.
Pick $1 > \delta > 0$ so small that for $|t - t_1| \le \delta$ the
points $\hat{x}_i(t)$ lie within spheres of radius $r/10$ of the points
$\hat{x}_1, \ldots, \hat{x}_{n+1}$. This is possible because of the
previous lemma 1.
Consider the convex combination of controllers
$$u_\lambda(t) = \lambda_1 u_1(t) + \lambda_2 u_2(t) + \cdots
+ \lambda_{n+1}u_{n+1}(t), \qquad \lambda_i \ge 0,\quad
\sum_i \lambda_i = 1$$
(Note $|u_\lambda^i| \le 1$) and the corresponding responses
$\hat{x}_\lambda(t)$ of $\mathcal{L}$ with initial data $(0,x_0)$. For
each fixed $t$, $|t - t_1| \le \delta$, these response end points
$\hat{x}_\lambda(t)$ sweep out a surface section $S$ which lies below
the plane $x^0 = p^0$ by convexity, above or on the plane $x^0 = 0$
because of the positive nature of $F$, and intersects the line segment
$\{0 \le x^0 \le p^0,\; x = p\}$ (see proof of theorem 1). Hence
$\hat{G}$ meets $\hat{K}(t)$ for $|t - t_1| < \delta < 1$.

We now consider the problem of existence of optimum
controllers.

Theorem 4 Consider the system $\mathcal{L}$ as above with compact
restraint set
$\Omega = \{u \mid |u^i| \le 1,\; i = 1,2,\ldots,m\} \subset R^m$,
initial point $(0,x_0) \in R^{n+1}$ at time $t_0$ and constant compact
target set $\hat{G} = \{x^0,x \mid 0 \le x^0 \le \beta,\; x \in G\}$ for
$\beta > 0$. If there exists an admissible controller
$u(t) \subset \Omega$ steering $\hat{x}_0$ to $\hat{G}$ on
$t_0 \le t \le t_1$ then there exists an optimum controller (also
admissible) steering $\hat{x}_0$ to $\hat{G}$ in minimum time duration
$t^* - t_0$.

Proof If $(0,x_0) \in \hat{G}$ then $t^* = t_0$ and optimum control is
not required. So assume $(0,x_0) \notin \hat{G}$ and consider the set
of attainability $\hat{K}(t_1)$ for $t_1 \ge t_0$. Since there is one
controller which steers $(0,x_0)$ to $\hat{G}$ the set $\hat{K}(t_1)$
meets $\hat{G}$ for some $t_1 > t_0$. Define $t^*$ to be the greatest
lower bound of all times $t_1$ such that $\hat{K}(t_1)$ meets
$\hat{G}$. By the continuous dependence of $\hat{K}(t_1)$ on $t_1$ the
set of times for which $\hat{K}(t_1)$ meets $\hat{G}$ is a closed set
in $R^1$. Hence $t^*$ is the first time $\hat{K}(t_1)$ meets $\hat{G}$
and therefore pick as the optimum controller $u^*(t)$,
$t_0 \le t \le t^*$, a controller which steers $\hat{x}_0$ to
$\hat{K}(t^*) \cap \hat{G}$.
The next theorem asserts that for optimum control we need only
consider points of the lower boundary of the set of attainability
and therefore by theorem 2 extremal controllers.

A sufficiency condition is also included.

Theorem 5. Consider the system $\mathcal{L}$ as above with compact
rectangular restraint set $\Omega$, initial point $(0,x_0)$ at $t_0$
and compact convex target set
$\hat{G} = \{x^0,x \mid 0 \le x^0 \le \beta,\; x \in G\}$
($\beta > 0$). Let $u^*(t)$ be a minimal time optimal controller
steering $\hat{x}^*(t)$ from $\hat{x}_0$ to $\hat{G}$. Then $u^*(t)$ is
extremal, that is, there exists a nonvanishing adjoint response
$\hat{\eta}(t) = (\eta_0, \eta(t))$ with $\eta_0 < 0$ so that
$$\eta(t)B(t)u^*(t) = \max_{u\in\Omega}\{\eta(t)B(t)u\}$$
almost always on $[t_0,t^*]$ with $\hat{\eta}(t^*)$ an outward normal of
$\hat{K}(t^*)$ at $\hat{x}^*(t^*)$ on $\partial\hat{K}(t^*)$, and
$\hat{\eta}(t^*)$ satisfies the transversality condition, namely,
$\hat{\eta}(t^*)$ is normal to a supporting hyperplane $\pi$ of
$\hat{G}$ and the set of attainability $\hat{K}(t^*)$ which
separates $\hat{K}(t^*)$ from $\hat{G}$.

Moreover, if for each point [3] $\hat{x} \in \hat{G}$ there exists a
nonmaximal controller $u(t) \subset \Omega$ so that on
$t_0 \le t < \infty$ the response $\hat{x}_u(t)$ initiating at
$\hat{x} = \hat{x}_u(t_0)$ is contained in $\hat{G}$, then when $u(t)$
is an admissible extremal controller steering $\hat{x}_0$ to $\hat{G}$
by means of a response satisfying the transversality condition it is
an optimum controller.

Proof By assumption there exists a controller steering $\hat{x}_0$
to $\hat{G}$ so $\hat{G}$ meets $\hat{K}(t^*)$. Suppose $\hat{G}$ meets
the interior of $\hat{K}(t^*)$. This is impossible because then
$\hat{G}$ meets the interior of $\hat{K}(t)$ for $|t - t^*| < \delta$,
$\delta > 0$, by theorem 3 and this contradicts the optimality of the
controller. Hence $\partial\hat{G}$ meets $\partial\hat{K}(t^*)$ so
that the optimum controller must steer to $\partial\hat{K}(t^*)$.

We must show that it steers to a lower boundary point to conclude
that it is extremal. This follows at once because $\hat{K}(t)$ always
first makes contact with $\hat{G}$ at a lower boundary point, as can be
seen by considering how the compact set $\hat{K}(t_1)$ with convex
lower surface moves with respect to the set $\hat{G}$. Thus if
$u^*(t)$ is optimal it is extremal and by theorem 2 there exists the
nonvanishing adjoint response $\hat{\eta}(t)$ so that
$$\eta(t)B(t)u^*(t) = \max_{u\in\Omega}\eta(t)B(t)u$$
where $\hat{\eta}(t^*)$ satisfies the transversality condition: since
$\hat{G}$ and the lower boundary of $\hat{K}(t^*)$ are convex they can
be separated by a supporting hyperplane $\pi$, and we choose
$\hat{\eta}(t^*)$ to be normal to $\pi$ and directed into the half
space containing $\hat{G}$.

When $u(t)$ is an admissible extremal controller steering
$\hat{x}_0$ to $\hat{G}$ and satisfying the transversality condition it
must be an optimum controller if $\hat{G}$ has the property that
through each point $\hat{x} \in \hat{G}$ there passes a nonmaximal
response which remains forever in $\hat{G}$. This follows because once
$\hat{G}$ and $\hat{K}(t)$ come together the interior of $\hat{K}(t)$
has a nonempty intersection with $\hat{G}$ so that the transversality
condition can only be satisfied once, and therefore there is only one
time, namely $t^*$, for which an extremal controller can steer
to $\hat{G}$ and satisfy the transversality condition. Thus any such
extremal controller satisfying the transversality condition
is an optimum controller.

Q.E.D.

We have therefore reduced the problem of finding an
optimum controller for the approximation problem to that of
finding a solution to the two point boundary value problem
as given by the 2n+2 equations:
$$\dot{x}^0 = F(x)$$
$$\dot{x} = A(t)x + B(t)u, \quad u \text{ attaining }
\max_{u\in\Omega}\{\eta(t)B(t)u\}$$
$$\dot{\eta} = -\eta A(t) - \eta_0\frac{\partial F}{\partial x}(x)$$
$$\dot{\eta}_0 = 0 \quad (\eta_0 < 0)$$
with boundary conditions $\hat{x}(t_0) = \hat{x}_0$,
$\hat{x}(t^*) \in \partial\hat{G}$ with $\hat{\eta}(t^*)$ an interior
normal to $\hat{G}$ at $\hat{x}(t^*)$.

3) An Example of Approximate Bounded Phase Coordinate Time
Optimal Control

We shall consider a very simple example to illustrate some
of the theory of the previous section. Consider a simple
mechanism with position coordinate $x$ and velocity coordinate
$y$. Suppose it is desired to bring the mechanism to rest by
means of a bidirectional thrust force $u(t)$ limited to at most 1
in magnitude, and suppose the velocity is not to exceed .6 in
magnitude. That is, consider the linear system
$$\dot{x} = y$$
$$\dot{y} = u(t)$$
with $|u(t)| \le 1$, $\Lambda = \{x,y \mid |y| \le .6\}$, $x(0) = 10$,
and $y(0) = 0$.
Pick
$$F(x,y) = \begin{cases}
\tfrac12\big(y - \tfrac12\big)^2 & \text{for } y > \tfrac12\\
0 & \text{for } |y| \le \tfrac12\\
\tfrac12\big(y + \tfrac12\big)^2 & \text{for } y < -\tfrac12
\end{cases}$$

We shall later determine the parameter $\beta > 0$ so that the
strict bound on $y$ is not exceeded. Problems in which the
bound is soft are more easily handled since then we can
generally pick $\beta$ ahead of time and in a straightforward manner
solve the two point boundary value problem. Here we have
picked $F(x,y)$ so that we are constraining the response even
before the boundary of $\Lambda$ is exceeded in hopes of maintaining
the strict bound on $y$. To solve this approximate problem
it is merely required that we find a solution of the
system:
$$\dot{x}^0 = F(x,y)$$
$$\dot{x} = y$$
$$\dot{y} = u, \quad u \text{ attaining } \max_{u\in\Omega}\{\eta_2 u\}$$
$$\dot{\eta}_0 = 0 \quad (\eta_0 < 0)$$
$$\dot{\eta}_1 = 0$$
$$\dot{\eta}_2 = -\eta_1 - \eta_0\frac{\partial F}{\partial y}$$
with $x^0(0) = 0$, $x(0) = 10$, $y(0) = 0$, $x^0(t_1) \le \beta$,
$x(t_1) = 0$, $y(t_1) = 0$ for some $t_1 > 0$.
A simple calculation shows that picking $\beta = .08$,
$\eta_0(0) = -10$, $\eta_1(0) = -1$, $\eta_2(0) = -.55$ provides a time
optimal solution for this problem. A plot of this response
is given by figure 1. Note in this problem the exact
optimum solution was obtained, but in general one would pick
different $F(x,y)$'s to get better approximations.
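
The quoted initial values can be checked on the first arc of the
response with a short sketch. It rests on assumptions not stated
explicitly above: the penalty is taken as
$F = \tfrac12(|y| - \tfrac12)^2$ for $|y| > \tfrac12$ and zero
otherwise, the adjoint obeys
$\dot{\eta}_2 = -\eta_1 - \eta_0\,\partial F/\partial y$, and the control
on the first arc is $u = \operatorname{sgn}\eta_2 = -1$. The function
names are hypothetical.

```python
ETA0, ETA1 = -10.0, -1.0   # constant adjoint components quoted in the text


def dF_dy(y):
    """Derivative of the assumed penalty F with respect to y."""
    if y > 0.5:
        return y - 0.5
    if y < -0.5:
        return y + 0.5
    return 0.0


def eta2_dot(y):
    """Adjoint equation for eta_2 (assumed form, see the lead-in)."""
    return -ETA1 - ETA0 * dF_dy(y)


def first_arc(t_end=0.6, steps=60000):
    """Integrate y' = -1 and eta_2' on [0, t_end] by the midpoint rule."""
    dt = t_end / steps
    y, eta2 = 0.0, -0.55
    for _ in range(steps):
        eta2 += dt * eta2_dot(y - 0.5 * dt)   # evaluate at the step midpoint
        y -= dt                                # full thrust, u = -1
    return y, eta2
```

Under these assumptions $\eta_2$ rises from $-.55$ and vanishes just as
$y$ reaches $-.6$, where $\dot{\eta}_2$ is also zero, so the control can
drop to zero and the velocity can be held at the bound.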

4) Remarks on the approximate bounded phase coordinate problems
with integral cost

As before consider the linear control process
$$\mathcal{L})\qquad \dot{x} = A(t)x + B(t)u(t)$$
satisfying the conditions stated at the beginning of section
1. As a cost functional of control consider
$$C(u) = g(x(T)) + \int_{t_0}^{T}\{f^0(x,t) + h^0(u,t)\}dt$$
where $T$ = fixed time $> t_0$ and the real functions $f^0(x,t)$ and
$h^0(u,t)$ are continuously differentiable and $f^0(x,t)$ is a
convex function of $x$ for each $t$.

The problem of optimal control is to pick an admissible
controller $u(t)$ on $[t_0,T]$ so that the response $x_u(t)$ of
$\mathcal{L}$ moves from $x_0$ to a target set $G \subset R^n$ at $T$
($G$ may be the whole space) and minimizes $C(u)$ with the entire
response $x_u(t)$ contained in the closed convex restraint set
$\Lambda$.

As before we introduce the convex differentiable function
$F(x)$ satisfying the conditions
$$F(x) > 0 \text{ if } x \notin \Lambda, \qquad
F(x) = 0 \text{ if } x \in \Lambda.$$

The approximation problem is obtained by adding $\lambda F(x)$
to the integrand of the cost functional $C(u)$ to obtain a
new cost functional
$$C_\lambda(u) = g(x(T)) + \int_{t_0}^{T}\{f^0(x,t) + \lambda F(x)
+ h^0(u,t)\}dt
\ge g(x(T)) + \int_{t_0}^{T}(f^0(x,t) + h^0(u,t))dt,$$
here $\lambda \ge 0$. If $\lambda$ is sufficiently large then one would
expect that the contribution from the term $\lambda F(x)$ can be small
only if the response stays near $\Lambda$ or within it. The
approximation problem is to find that controller $u(t)$ which minimizes
$C_\lambda(u)$ and steers to $G \subset R^n$.

We shall assume that $h^0(u,t)$ is convex in $u$ for each $t$
or that the controller is bounded and $h^0$ is a positive function
of $u$ for each $t$. In either case the previous theory can be
applied after slight modification by noting that
$\bar{f}^0(x,t) = f^0(x,t) + \lambda F(x)$ is a convex function of $x$
for each $t$ since both $f^0$ and $F$ were convex functions and by
noting the contribution to $x^0(T)$ made by the terms $h^0(u,t)$. That
is, the problem has now been cast as one which is covered by the
sufficiency results of reference 5 which are also necessary
[reference 7] and can be obtained as a slight modification
of the results of section 2.
NOTES ON THE RESTRICTED THREE BODY PROBLEM:

APPROXIMATE  BEHAVIOR  OF  SOLUTIONS  NEAR  THE  COLLINEAR 

LAGRANGIAN  POINTS 

C.  C.  Conley 


Introduction 


The purpose of these remarks is to describe in some detail the
geometry of solutions of the restricted three body problem (as viewed
in the rotating coordinate system) near those equilibrium points which
are collinear with the two positive masses.

We deal only with the linearized equations, but make some qualitative
observations which can be carried over without difficulty to the
nonlinear equations for suitable values of the Jacobi Constant.

This report is intended to be the first in a series whose ultimate
aims include an existence proof for the "periodic" solutions discovered
numerically by M. Davidson [1]. Whether or not this can be accomplished
remains to be seen, but it does seem clear that a thorough
understanding of the behavior of orbits near the equilibrium point will
be required. More will be said about this question in later reports.

From the work in this report we obtain the following qualitative
picture of solutions of the linearized equations for values of the
"Jacobi Constant" slightly above that of the equilibrium point.

The projections of orbits into the configuration space are constrained
to lie in the region R between the two branches of a hyperbola symmetric
with respect to the line, $\ell$, joining the positive mass points,
which line is contained in R.

We will generally restrict our attention to the portion of the phase
space corresponding to a closed interval I of $\ell$ about the
projection of the equilibrium point. Recalling that the value of the
integral is fixed, we will see that this portion of the phase space is
homeomorphic to $S^2 \times I$ ($S^2$ is the two-sphere) and so may be
viewed as the space between two concentric spheres together with the
bounding spheres.
If I is large enough we will see that there is exactly one closed
orbit in this portion of the phase space. This corresponds to one of the
family of periodic solutions which are known (by a theorem of Lyapounov)
to exist in a neighborhood of the equilibrium point even for the
nonlinear equations.

There are four "cylinders" in the phase space which abut on this
periodic orbit and which are invariant under the flow. Two of these run
to the outer bounding sphere and two to the inner. One of each of these
two pairs of cylinders corresponds to a family of solutions which is
asymptotic to the periodic solution as the time goes to $+\infty$; the
others to families asymptotic as time goes to $-\infty$. These cylinders
act as separatrices. They separate those solutions which go from the
inner to the outer sphere (or vice versa) from those that do not: in the
language of the configuration space, they separate those solutions which
make a transit of the region of the equilibrium from those which do not
cross this region.

(The existence of such cylinders for the restricted problem is apparent.
From a theorem of J. Moser [2] it can be seen that they are described by
real analytic functions near the equilibrium point.)

The projection of these cylinders into the configuration space covers
the union of two infinite strips the boundaries of which are the
enveloping lines of the solutions asymptotic to the periodic solution
(figure 1). These four enveloping lines (which are tangent to the
hyperbolas bounding R as well as to the periodic orbit) divide R into
several regions and we will be able to determine the nature of solutions
in these different regions. Further description will be easier to give
later.

An amusing result is that exactly one solution from each of the four
cylinders of solutions asymptotic to the periodic solution has a cusp
(as viewed in the configuration space). A modification of this statement
holds as well for the restricted three body problem. These four cusp
points determine arcs on the hyperbolas bounding R, and any solution
which cusps on these arcs is making a transit of the equilibrium region.

A statement which is perhaps a little more useful is that there are
two unique solutions which are "best" for making a transit of the
equilibrium region in that they take the least time. One of the
(possible) difficulties in using orbits which correspond to the
solutions of M. Davidson is the amount of time it is possible to spend
in the region of the equilibrium.* It may be useful to have a simple
criterion for decreasing this time. An approximate means to determine
the "best" orbit is given in statement eleven; a more accurate one could
be derived using the result of J. Moser [2].

* The values of the Jacobi Constant considered here are small relative
to the ones usually considered.

FIGURE 1. SOLUTIONS WITH VELOCITY VECTOR IN SHADED WEDGES
GO ACROSS THE EQUILIBRIUM REGION

As stated above, these remarks have been collected primarily with
a view to later applications. However, it is hoped they are of some
value in themselves in gaining insight into the nature of solutions of
the restricted three body problem.


1. The Equations

Without going through the arguments, we can state that the linearized
equations near the equilibrium points in which we are presently
interested form a Hamiltonian system with Hamiltonian function:

(1)  $H(x_1,x_2,y_1,y_2) = \tfrac12\{(y_1 - \omega x_2)^2
+ (y_2 + \omega x_1)^2 - a x_1^2 + b x_2^2\}$

($\omega$, a, b are positive constants)

The equations are

(2)  $\dot{x} = H_y$, $\quad \dot{y} = -H_x$.

In these equations, $\omega$ is the frequency of rotation of the
coordinate system; we assume $\omega$ is positive.

The constants a, b will be arbitrary positive constants in our
discussion. In the case of the equilibrium point between the two
positive masses of the restricted problem, a = 2b.* If the mass ratio is
that of the Earth and Moon, then with $\omega = 1$, a is slightly larger
than 8.


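We introduce the following notation:

(3)  $\tilde{u} = (x_1,x_2,y_1,y_2)$;
$$S = \begin{pmatrix} -a & 0\\ 0 & b \end{pmatrix};\quad
J = \begin{pmatrix} 0 & 1\\ -1 & 0 \end{pmatrix};\quad
I = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix};$$
$$\mathcal{J} = \begin{pmatrix} 0 & I\\ -I & 0 \end{pmatrix};\quad
\mathcal{S} = \begin{pmatrix} \omega^2 I + S & \omega J\\
-\omega J & I \end{pmatrix}.$$

* This statement is also true of the other two equilibria considered,
however, the next is not.

Our equations are then written as

(4)  $H(\tilde{u}) = \tfrac12(\tilde{u}, \mathcal{S}\tilde{u})$,
$\quad \dot{\tilde{u}} = \mathcal{J}H_{\tilde{u}}
= \mathcal{J}\mathcal{S}\tilde{u}$.

Now to make the computations easier we introduce the non-canonical
transformation

(5)  $\tilde{u} = Au$, $\quad
A = \begin{pmatrix} I & 0\\ \omega J & I \end{pmatrix}$.

The equations then transform to:

(6)  $\dot{u} = Bu$, $\quad
B = A^{-1}\mathcal{J}\mathcal{S}A
= \begin{pmatrix} 0 & I\\ -S & -2\omega J \end{pmatrix}$

and the integral is given by
$$H(u) = H(Au) = \tfrac12(u, Eu), \qquad
E = A^{T}\mathcal{S}A = \begin{pmatrix} S & 0\\ 0 & I \end{pmatrix}.$$

If we now write $u = (x_1,x_2,z_1,z_2)$, the equations above give
$\dot{x} = z$. Thus if we consider projections of orbits in the
x-plane, $z = (z_1,z_2)$ corresponds to the tangent vector.

In this notation we have for the integral:

(7)  $H = \tfrac12\{z_1^2 + z_2^2 - a x_1^2 + b x_2^2\}$.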

2. The Phase Space

We will be primarily interested in those orbits for which

(8)  H = h > 0,

and will describe the projections of these orbits in the x-plane.

Statement 1. a) For H = h, the projected orbits are constrained to move
in the region R given by

R:  $-a x_1^2 + b x_2^2 \le 2h$.

b) If h > 0 R is a connected region, otherwise it has two components.

c) If h > 0, the phase space is homeomorphic to $S^2 \times E^1$
($S^2$ is the 2-sphere, $E^1$ the real line). We will be most
interested in that part of the phase space for which $|x_1| \le c$,
$c > 0$. This region can be considered as the space between two
concentric spheres including the boundaries.

Proof: Only part c) needs comment. To see this statement, consider the
line $x_1 = c_1$. On this line we have
$$z_1^2 + z_2^2 + b x_2^2 = 2h + a c_1^2.$$
So the corresponding points in the phase space form a 2-sphere. The rest
follows.
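
Statement 1 a) can be illustrated with a small sketch. It assumes the
integral has the form $H = \tfrac12\{z_1^2 + z_2^2 - a x_1^2 + b x_2^2\}$,
so that the region R is $-a x_1^2 + b x_2^2 \le 2h$; the function names
are hypothetical.

```python
def energy(x1, x2, z1, z2, a, b):
    """Value of the integral H at the state (x1, x2, z1, z2)."""
    return 0.5 * (z1 ** 2 + z2 ** 2 - a * x1 ** 2 + b * x2 ** 2)


def in_region_R(x1, x2, a, b, h):
    """True if the configuration point (x1, x2) lies in R for H = h."""
    return -a * x1 ** 2 + b * x2 ** 2 <= 2.0 * h
```

Because the velocity terms $z_1^2 + z_2^2$ are nonnegative, every state
with $H = h$ projects to a point of R, and the boundary of R is reached
only where the velocity vanishes.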

3. Computations

Statement 2. a) The matrix B has one pair of real eigenvalues and one
pair of imaginary eigenvalues. These we denote by

$\pm\mu$, $\pm i\nu$ where $\mu, \nu > 0$.

b) The corresponding eigenvectors can be chosen to be:

eigenvalue $\mu$:    $v_1 = (1,\ \sigma,\ \mu,\ \mu\sigma)$
eigenvalue $-\mu$:   $v_2 = (1,\ -\sigma,\ -\mu,\ \mu\sigma)$
eigenvalue $i\nu$:   $w_1 = (1,\ i\tau,\ i\nu,\ -\nu\tau)$
eigenvalue $-i\nu$:  $w_2 = \bar{w}_1 = (1,\ -i\tau,\ -i\nu,\ -\nu\tau)$

where $\sigma$ and $\tau$ are real, $\sigma > 0$, $\tau < 0$ (cf. e) of
this Statement)

c) The general solution is of the form
$$u(t) = \alpha_1 e^{\mu t}v_1 + \alpha_2 e^{-\mu t}v_2
+ 2\,\mathrm{Re}(\beta e^{i\nu t}w_1)$$
where $\alpha_1, \alpha_2$ are real, $\beta$ is complex.

d) The value of the integral on the solution is
$$\tfrac12(u(t), E u(t)) = \alpha_1\alpha_2\epsilon_1
+ |\beta|^2\epsilon_2$$
where
$$\epsilon_1 = (v_1, E v_2)$$
$$\epsilon_2 = (w_1, E w_2)$$

(Note: the inner product is the real one even when vectors are complex.)

e) The constants $\mu$, $\nu$, $\sigma$, $\tau$, $\epsilon_1$,
$\epsilon_2$ satisfy:

1)  $a - 2\omega\sigma\mu = \mu^2$; in particular, $\sigma > 0$

2)  $-b\sigma + 2\omega\mu = \mu^2\sigma$

3)  $a + 2\omega\tau\nu = -\nu^2$; in particular, $\tau < 0$

4)  $-b\tau + 2\omega\nu = -\nu^2\tau$

5)  $(v_1, E v_1) = -a + b\sigma^2 + \mu^2 + \sigma^2\mu^2 = 0$

6)  $(w_1, E w_1) = -a - b\tau^2 - \nu^2 + \nu^2\tau^2 = 0$

7)  $(v_1, E w_1) = -a + ib\tau\sigma + i\mu\nu - \sigma\mu\tau\nu = 0$

8)  $\epsilon_1 = (v_1, E v_2) = -a - b\sigma^2 - \mu^2 + \sigma^2\mu^2
= -2(b\sigma^2 + \mu^2) < 0$

9)  $\epsilon_2 = (w_1, E w_2) = -a + b\tau^2 + \nu^2 + \nu^2\tau^2
= 2(b\tau^2 + \nu^2) > 0$
10)  $2ab(\sigma^2 + \tau^2) = \epsilon_2(a - b\sigma^2)$

11)  $\sigma\mu\tau\nu = -a$; $\quad b\tau\sigma = -\mu\nu$;
$\quad \mu^2\nu^2 = ab$; $\quad -\tau\sigma = \sqrt{a/b}$
(from 7))


Proof of Statement 2: (Recall
$B = \begin{pmatrix} 0 & I\\ -S & -2\omega J \end{pmatrix}$.)

To prove parts a) and b) and equations 1) - 4) of e), we first observe
that any eigenvector must have a non-zero first component which we can
take to be 1. The form of B then forces the eigenvector to be
$u = \{1, \rho, \lambda, \rho\lambda\}$ where $\lambda$ is the
eigenvalue. Now the last two equations in the system $Bu = \lambda u$
require that
$$a - 2\omega\lambda\rho = \lambda^2$$
$$-b\rho + 2\omega\lambda = \lambda^2\rho$$
Elimination of $\rho$ gives
$$\lambda^4 + (b - a + 4\omega^2)\lambda^2 - ab = 0$$
and parts a) and b) as well as the first four equations of part e)
follow.
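
The computation in this proof can be sketched numerically. The sketch
solves the quartic as a quadratic in $\lambda^2$ and then obtains
$\sigma$ and $\tau$ from equations 1) and 3) of part e); the function
name and the sample values a = 8, b = 4, $\omega$ = 1 are illustrative
assumptions.

```python
import math


def eigen_constants(a, b, omega):
    """Return (mu, nu, sigma, tau) for positive constants a, b, omega.

    The roots of X^2 + p X - a b = 0 with p = b - a + 4 omega^2 are
    X = mu^2 > 0 and X = -nu^2 < 0.
    """
    p = b - a + 4.0 * omega ** 2
    disc = math.sqrt(p * p + 4.0 * a * b)
    mu = math.sqrt((disc - p) / 2.0)
    nu = math.sqrt((disc + p) / 2.0)
    sigma = (a - mu ** 2) / (2.0 * omega * mu)     # equation 1) of part e)
    tau = -(a + nu ** 2) / (2.0 * omega * nu)      # equation 3) of part e)
    return mu, nu, sigma, tau


if __name__ == "__main__":
    print(eigen_constants(8.0, 4.0, 1.0))
```

Since the product of the two roots in $\lambda^2$ is $-ab < 0$, one root
is always positive and one negative, which is why B has exactly one real
pair and one imaginary pair of eigenvalues.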

Part  c)  needs  no  comment. 

Parts d) and e) follow from general considerations:

Lemma 1. Let v and w be eigenvectors of the matrix
$\mathcal{J}\mathcal{S}$ where $\mathcal{S}$ is symmetric and
$\mathcal{J}$ is skew symmetric and orthogonal, and let the
corresponding eigenvalues be $\lambda$ and $\mu$ respectively.

Then either
$$(v, \mathcal{S}w) = 0,$$
or
$$\lambda + \mu = 0.$$

Proof: Since $\mathcal{J}$ is orthogonal,
$$(v, \mathcal{S}w) = (\mathcal{J}v, \mathcal{J}\mathcal{S}w)
= \mu(\mathcal{J}v, w)$$
$$(\mathcal{S}v, w) = (\mathcal{J}\mathcal{S}v, \mathcal{J}w)
= \lambda(v, \mathcal{J}w).$$
The result follows by the symmetry of $\mathcal{S}$ and skew symmetry
of $\mathcal{J}$.

To apply this lemma to our problem we use the fact that, since
$B = A^{-1}\mathcal{J}\mathcal{S}A$, the vectors $Av_1$ and $Aw_1$ are
eigenvectors of $\mathcal{J}\mathcal{S}$. (The notation is that of §1.)

Part d) and equations 5) through 9) of e) now follow. The remaining
equations and statements in e) are proved with a little algebra. The
harder ones will be seen geometrically later so the computations are
omitted.
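
Several of the relations in part e) can be checked numerically in place
of the omitted algebra. The sketch below assumes $E$ is the diagonal
matrix with entries $(-a, b, 1, 1)$, takes the eigenvectors in the form
$(1, \rho, \lambda, \rho\lambda)$, and uses the non-conjugating inner
product mentioned in part d); all function names are hypothetical.

```python
import math


def constants(a, b, omega):
    """mu, nu, sigma, tau from the quartic and equations 1), 3) of e)."""
    p = b - a + 4.0 * omega ** 2
    disc = math.sqrt(p * p + 4.0 * a * b)
    mu = math.sqrt((disc - p) / 2.0)
    nu = math.sqrt((disc + p) / 2.0)
    sigma = (a - mu * mu) / (2.0 * omega * mu)
    tau = -(a + nu * nu) / (2.0 * omega * nu)
    return mu, nu, sigma, tau


def form_E(v, w, a, b):
    """Bilinear form (v, E w), E = diag(-a, b, 1, 1), no conjugation."""
    return -a * v[0] * w[0] + b * v[1] * w[1] + v[2] * w[2] + v[3] * w[3]


def identity_residuals(a, b, omega):
    """Absolute residuals of relations 5) - 9) and 11) of part e)."""
    mu, nu, sigma, tau = constants(a, b, omega)
    v1 = (1.0, sigma, mu, mu * sigma)
    v2 = (1.0, -sigma, -mu, mu * sigma)
    w1 = (1.0, 1j * tau, 1j * nu, -nu * tau)
    w2 = (1.0, -1j * tau, -1j * nu, -nu * tau)
    return {
        "5": abs(form_E(v1, v1, a, b)),
        "6": abs(form_E(w1, w1, a, b)),
        "7": abs(form_E(v1, w1, a, b)),
        "8": abs(form_E(v1, v2, a, b) + 2.0 * (b * sigma ** 2 + mu ** 2)),
        "9": abs(form_E(w1, w2, a, b) - 2.0 * (b * tau ** 2 + nu ** 2)),
        "11a": abs(sigma * mu * tau * nu + a),
        "11b": abs(b * tau * sigma + mu * nu),
        "11c": abs(mu ** 2 * nu ** 2 - a * b),
    }
```

All residuals should vanish up to rounding for any positive a, b and
$\omega$; a check of this kind does not replace the proof, but it guards
against sign errors in the eigenvector components.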


Statement 3. If u(t) is a solution such that
$u(0) = (x_1, x_2, z_1, z_2)$, then the constants $\alpha$, $\beta$
(Statement 2, c) are given by:
$$\epsilon_1\alpha_1 = -a x_1 - b\sigma x_2 - \mu z_1 + \mu\sigma z_2
= (u, E v_2)$$
$$\epsilon_1\alpha_2 = -a x_1 + b\sigma x_2 + \mu z_1 + \mu\sigma z_2
= (u, E v_1)$$
$$\epsilon_2\beta = -a x_1 - ib\tau x_2 - i\nu z_1 - \nu\tau z_2
= (u, E w_2)$$

Proof: This follows on dotting the equation
$$u(0) = \alpha_1 v_1 + \alpha_2 v_2 + \beta w_1 + \bar{\beta} w_2$$
with $Ev_2$, $Ev_1$, $Ew_2$ respectively, and using 5) - 9) of
Statement 2 e).
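
The formulas of Statement 3 can be exercised by decomposing an initial
state and rebuilding it from the coefficients. The sketch assumes the
eigenvectors $v_1 = (1, \sigma, \mu, \mu\sigma)$,
$v_2 = (1, -\sigma, -\mu, \mu\sigma)$ and
$w_1 = (1, i\tau, i\nu, -\nu\tau)$; the function names and the sample
state are hypothetical.

```python
import math


def constants(a, b, omega):
    """mu, nu, sigma, tau as in Statement 2."""
    p = b - a + 4.0 * omega ** 2
    disc = math.sqrt(p * p + 4.0 * a * b)
    mu = math.sqrt((disc - p) / 2.0)
    nu = math.sqrt((disc + p) / 2.0)
    sigma = (a - mu * mu) / (2.0 * omega * mu)
    tau = -(a + nu * nu) / (2.0 * omega * nu)
    return mu, nu, sigma, tau


def decompose(u0, a, b, omega):
    """Return (alpha1, alpha2, beta) for u(0) = (x1, x2, z1, z2)."""
    mu, nu, sigma, tau = constants(a, b, omega)
    x1, x2, z1, z2 = u0
    eps1 = -2.0 * (b * sigma ** 2 + mu ** 2)
    eps2 = 2.0 * (b * tau ** 2 + nu ** 2)
    alpha1 = (-a * x1 - b * sigma * x2 - mu * z1 + mu * sigma * z2) / eps1
    alpha2 = (-a * x1 + b * sigma * x2 + mu * z1 + mu * sigma * z2) / eps1
    beta = complex(-a * x1 - nu * tau * z2, -b * tau * x2 - nu * z1) / eps2
    return alpha1, alpha2, beta


def recompose(alpha1, alpha2, beta, a, b, omega):
    """Rebuild u(0) = alpha1 v1 + alpha2 v2 + 2 Re(beta w1)."""
    mu, nu, sigma, tau = constants(a, b, omega)
    v1 = (1.0, sigma, mu, mu * sigma)
    v2 = (1.0, -sigma, -mu, mu * sigma)
    w1 = (1.0, 1j * tau, 1j * nu, -nu * tau)
    return tuple(alpha1 * p + alpha2 * q + 2.0 * (beta * r).real
                 for p, q, r in zip(v1, v2, w1))
```

For example, `recompose(*decompose(u0, 8.0, 4.0, 1.0), 8.0, 4.0, 1.0)`
should return `u0` up to rounding, and the value of the integral at
`u0` should agree with $\alpha_1\alpha_2\epsilon_1 + |\beta|^2\epsilon_2$.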

Statement 4. (Recall that
$$H(u) = \tfrac12\{z_1^2 + z_2^2 - a x_1^2 + b x_2^2\}
= \alpha_1\alpha_2\epsilon_1 + |\beta|^2\epsilon_2$$
where $\epsilon_1 < 0$; $\epsilon_2 > 0$).

Consider the projection in the x-plane of solutions in the integral
surface
$$H(u) = h > 0$$
The solutions in the integral surface divide into classes as follows:

1) The (unique) periodic solution: $\alpha_1 = \alpha_2 = 0$

2) Solutions which are asymptotic to the periodic solution as
$t \to \infty$ ($-\infty$):

$\alpha_1 = 0$ ($\alpha_2 = 0$)

3) Solutions whose $x_1$ component tends to $+\infty$ ($-\infty$) as
$t \to \pm\infty$:

$\alpha_1, \alpha_2 > 0$ ($< 0$).

These are solutions whose projected orbits in the x-space lie
in a half space $x_1 > c$ or $x_1 < -c$. They do not make a "transit"
of the equilibrium region.

4) Solutions whose $x_1$ component goes from $-\infty$ to $+\infty$
($+\infty$ to $-\infty$) as t goes from $-\infty$ to $+\infty$:

$\alpha_1 > 0$, $\alpha_2 < 0$ ($\alpha_1 < 0$; $\alpha_2 > 0$).

These are the solutions which do cross the equilibrium region.

Proof: By inspection of the corresponding general solution.
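
The four classes depend only on the signs of the two real coefficients,
which the following sketch makes explicit. The function name and the
class labels are hypothetical, and exact zeros are compared without a
tolerance.

```python
def classify(alpha1, alpha2):
    """Assign a solution to one of the four classes of Statement 4."""
    if alpha1 == 0 and alpha2 == 0:
        return "periodic"        # class 1)
    if alpha1 == 0 or alpha2 == 0:
        return "asymptotic"      # class 2)
    if alpha1 * alpha2 > 0:
        return "non-transit"     # class 3): x1 keeps one sign at both ends
    return "transit"             # class 4): x1 changes sign between the ends
```

The coefficient $\beta$ plays no part in the classification; it only
adds a bounded oscillation to the $x_1$ component.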

We are particularly interested in the solutions of class 4) which, in
the case of the equilibrium between the two positive mass points, can
be interpreted as solutions going from the earth side of the equilibrium
to the moon side (or vice versa). Clearly the most "efficient" (least
time expenditure) such orbit is that for which $\beta = 0$ since the
"$\beta$-portion" of a solution contributes only useless oscillation; we
will come back to this point later.

Interpretation  for  restricted  Problem: 

Solution  1)  corresponds  of  course  to  the  periodic  solution  about  the 
equilibrium  point  of  the  restricted  problem  whose  existence  is  guaranteed 
by  a theorem  of  Lyapounov. 

The solutions of 2) correspond to the four families which are asymptotic
to the periodic solution as described in the introduction. Since the
argument of $\beta$ is free and can vary on a "circle," these four families
are easily seen to be "cylinders" which abut on the periodic solution.

One can now check that two of these cylinders "go to $+\infty$" and two
"to $-\infty$" (as $t \to \pm\infty$), i.e., to the region of the earth (say) or the moon (resp.).
Again, one easily checks that one of each of these pairs is asymptotic to
the periodic solution as $t$ goes to $+\infty$, the other as $t$ goes to $-\infty$.

The  solutions  of  3)  are  those  which  enter  the  region  of  the  equilibrium 
only  to  return  whence  they  came  while  those  of  4)  make  the  transit. 

While we have considered only the linearized equations, simple
considerations ensure the same qualitative picture for the equations of the
restricted problem.


Statement 5. If

$$x_1^2 > \frac{2h(a - \mu^2)}{a\mu^2} = c^2$$

then  a)  $x_1 z_1 > 0 \Rightarrow \alpha_1 x_1 > 0$

      b)  $x_1 z_1 < 0 \Rightarrow \alpha_2 x_1 > 0$

Interpretation: If a solution crosses the line $x_1 = c_1$ going away from the
origin, then if $c_1 > c$, the $x_1$ component of this solution must tend to
$+\infty$. If a solution crosses the line coming toward the origin, its $x_1$
component goes to $+\infty$ as $t \to -\infty$. Corresponding statements hold if
$x_1 = c_1 < -c$.

In particular, a solution of class 2) or 4) ($\alpha_1\alpha_2 \le 0$) can cross the
line $x_1 = c_1$ only once and must do so with $z_1 \ne 0$. We will make use of
this remark later.

Also, we can see that a solution crosses both of the lines $x_1 = \pm c_1$
if and only if $\alpha_1\alpha_2 < 0$. This comment allows us to give a precise geometric
meaning to the statement that "a solution makes a transit of the equilibrium
region." A similar definition works for the restricted problem for the same
reason.

Proof of Statement 5.

a)  We have (Statement 3)

$$\epsilon_1\alpha_1 = -a x_1 - b\sigma x_2 - \mu z_1 + \mu\sigma z_2,$$

where $\epsilon_1 < 0$ (Statement 2, e), 9)). Thus

$$\operatorname{sgn}\, x_1\alpha_1 = \operatorname{sgn}(a x_1^2 + b\sigma x_1 x_2 + \mu z_1 x_1 - \mu\sigma x_1 z_2).$$

Since $x_1 z_1 > 0$, we need only show that

$$a x_1^2 > |b\sigma x_1 x_2 - \mu\sigma x_1 z_2|.$$

We estimate (Schwarz)

$$|b\sigma x_1 x_2 - \mu\sigma x_1 z_2| \le |x_1|\,(b\sigma^2 + \mu^2\sigma^2)^{1/2}\,(b x_2^2 + z_2^2)^{1/2}.$$

Using Statement 2, e), 1) and the energy integral we have

$$a - \mu^2 = b\sigma^2 + \mu^2\sigma^2$$
$$b x_2^2 + z_2^2 \le 2h + a x_1^2$$

so that

$$|b\sigma x_1 x_2 - \mu\sigma x_1 z_2| \le |x_1|\,(a - \mu^2)^{1/2}\,(2h + a x_1^2)^{1/2} = |x_1|\,(a^2 x_1^2 + 2ha - 2h\mu^2 - a\mu^2 x_1^2)^{1/2}.$$

This last quantity is less than $a x_1^2$ provided $2ha - 2h\mu^2 - a\mu^2 x_1^2 < 0$,
which is the hypothesis. A similar proof holds for part b).


A statement stronger than the above can be proved if we place a
restriction on the constants $a$ and $\omega$; namely:


Statement 6. Recall the equations are given by

$$\dot{x}_1 = z_1; \qquad \dot{z}_1 = -2\omega z_2 + a x_1$$
$$\dot{x}_2 = z_2; \qquad \dot{z}_2 = 2\omega z_1 - b x_2.$$

If

$$x_1 = c_1 > \sqrt{\frac{8\omega^2 h}{a^2 - 4\omega^2 a}}$$

then $z_1 > 0$ implies the corresponding solution never returns to the
line $x_1 = c_1$. (Also $x_1(t) \to \infty$.) Furthermore, if $z_1 = 0$, $c_1$ is an
absolute minimum for $x_1$. A similar statement holds if

$$x_1 = c_1 < -\sqrt{\frac{8\omega^2 h}{a^2 - 4\omega^2 a}}.$$


Proof: The proof consists of showing that $\dot{z}_1 > 0$ under the above
circumstances. We have:

$$|z_2| \le \sqrt{2h + a x_1^2}$$

so that

$$4\omega^2 z_2^2 \le 4\omega^2 (2h + a x_1^2) < a^2 x_1^2,$$

the last inequality being the hypothesis. The result now follows.
This statement has no force unless

$$a^2 - 4\omega^2 a > 0,$$

a situation which does, however, hold for the equilibrium point of the
restricted problem between the two positive masses ($a > 8$; $\omega = 1$).
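
As a numerical illustration of Statement 6, the following Python sketch evaluates the threshold on $x_1$ and the worst-case lower bound on $\dot{z}_1 = -2\omega z_2 + a x_1$ obtained from $|z_2| \le \sqrt{2h + a x_1^2}$. The function names and the sample values of $a$, $\omega$ and $h$ are assumptions chosen for illustration only.

```python
import math


def statement6_threshold(a, omega, h):
    """Return sqrt(8 w^2 h / (a^2 - 4 w^2 a)), or None when a^2 - 4 w^2 a <= 0
    (the case in which the statement has no force)."""
    denom = a * a - 4.0 * omega ** 2 * a
    if denom <= 0.0:
        return None
    return math.sqrt(8.0 * omega ** 2 * h / denom)


def z1_dot_lower_bound(x1, a, omega, h):
    """Worst-case value of z1' = -2 w z2 + a x1 given |z2| <= sqrt(2h + a x1^2)."""
    return a * x1 - 2.0 * omega * math.sqrt(2.0 * h + a * x1 * x1)


if __name__ == "__main__":
    a, omega, h = 9.0, 1.0, 0.5   # illustrative values only
    c = statement6_threshold(a, omega, h)
    print("threshold:", c)
    print("bound just beyond threshold:", z1_dot_lower_bound(1.1 * c, a, omega, h))
```

The bound changes sign exactly at the threshold, which is why $\dot{z}_1 > 0$ is guaranteed only beyond it.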

Geometrically, we see from Statement 6 that the points where the
$x_1$ component of a solution can have a maximum must lie to the left of
the line $x_1 = \sqrt{8\omega^2 h/(a^2 - 4\omega^2 a)}$. Such a restriction is valid only when
$a^2 - 4\omega^2 a > 0$, as can easily be seen. This remark will be useful in a later
report.


Statement 7


The projection of the periodic solution in the x-plane is an ellipse
with minor axis of length $2\sqrt{h/\epsilon_2}$ in the direction of the $x_1$-axis and
major axis of length $2\tau\sqrt{h/\epsilon_2}$ in the direction of the $x_2$-axis.

Proof: (Assume $\beta$ is real.) The projection is given by

$$x_1(t) = 2\,\mathrm{Re}(\beta e^{i\nu t}) = 2\beta\cos\nu t$$
$$x_2(t) = -2\tau\,\mathrm{Im}(\beta e^{i\nu t}) = -2\tau\beta\sin\nu t$$

Also the energy integral gives

$$\beta^2\epsilon_2 = h.$$

That $|\tau| > 1$ follows from Statement 2, e), 6).

The result follows.

Statement 8. (Recall the solutions with $\alpha_1\alpha_2 = 0$ are those asymptotic
to the periodic solution.)

a)  The envelopes of projections in x-space of orbits with $\alpha_1 = 0$
are the straight lines

$$x_2 = -\sigma x_1 \pm \left(\frac{2h(a - \sigma^2 b)}{ab}\right)^{1/2} = -\sigma x_1 \pm 2\left(\frac{h}{\epsilon_2}\right)^{1/2}(\sigma^2 + \tau^2)^{1/2}.$$

The corresponding envelopes for $\alpha_2 = 0$ are:

$$x_2 = \sigma x_1 \pm \left(\frac{2h(a - \sigma^2 b)}{ab}\right)^{1/2}.$$

b)  All four of these lines are tangent to the boundaries of R (i.e.,
of the region of x-space wherein solutions must move; see Statement 1).

c)  The points of tangency lie on the lines

X,  . tjy^bh(Y-b^--T^2h(a-b,M 
(See  figure  1) 

Proof of Statement 8

a)  If $\alpha_1 = 0$ we have (Statement 2)

$$x_1 = \alpha_2 e^{-\lambda t} + 2\,\mathrm{Re}(\beta e^{i\nu t})$$
$$x_2 = -\sigma\alpha_2 e^{-\lambda t} - 2\tau\,\mathrm{Im}(\beta e^{i\nu t}) = -\sigma x_1 + 2\sigma\,\mathrm{Re}(\beta e^{i\nu t}) - 2\tau\,\mathrm{Im}(\beta e^{i\nu t})$$

The extreme values of $x_2$ for fixed $x_1$ are obtained by varying $\arg\beta$.
These are computed to be

$$x_2 = -\sigma x_1 \pm 2|\beta|(\sigma^2 + \tau^2)^{1/2}.$$

Finally, we have $|\beta| = \sqrt{h/\epsilon_2}$ from the energy integral, which gives one
of the alternate expressions in a). Observe that the extreme values are
achieved.

b)  We could prove b) by computation; however, the following
geometric argument carries over to the corresponding statement (that
"envelopes of solutions asymptotic to the periodic solution touch the
boundaries of R") for the restricted problem:

We first observe that we can obtain a space homeomorphic to the
phase space as follows: First deform R to an infinite strip (i.e.,
squeeze the boundaries down to straight lines). Noting that at each
point of R (except the boundaries) there is a "circle" of possible
velocities (i.e., $z_1^2 + z_2^2 = \text{const} > 0$), we cross the infinite strip with
a circle to obtain a "pipe," i.e., the space between two coaxial cylinders.

The length along the cylinder corresponds to the $x_1$ coordinate.
For each fixed $x_1$ there corresponds an annulus of points; the radial
variable in this annulus corresponds to $x_2$, while the angular variable
corresponds to the direction of the velocity vector $z = (z_1, z_2)$. The
inner and outer boundaries of the annulus correspond to boundary points
of R. These boundaries should be identified to (different) points since
$z_1^2 + z_2^2$ is zero on the boundary of R; however, we neglect this point
for the moment.

Now consider the "cylinder" of solutions with $\alpha_1 = 0$, say. For
fixed $x_1$, the corresponding points on the cylinder make a closed
loop in the pipe.

Now if $x_1 > c$ (Statement 5), and $\alpha_1 = 0$, then $z_1 < 0$. Thus the
corresponding "circle" does not go around the hole in the pipe. On the
other hand, the periodic orbit does encircle the hole since the velocity
vector on this orbit goes through all angles. Since the cylinder abuts
on this periodic orbit, some section of it must enclose the hole. It
follows that this cylinder must cross one of the bounding cylinders of the
pipe.

This implies that some orbit with $\alpha_1 = 0$ must touch the boundary
of R and so the envelopes of these orbits must cut this boundary.
However, they cannot go out of the region R, and therefore are tangent
to the boundary.

Part c) (and alternate expression in part a))

From parts a) and b) it follows that, for example, the equations

$$a x_1^2 - b x_2^2 + 2h = 0$$
$$x_2 = -\sigma x_1 + 2\left(\frac{h}{\epsilon_2}\right)^{1/2}(\sigma^2 + \tau^2)^{1/2}$$

have a unique solution for $x_1$.

This means the following quadratic equation has double roots:

$$(a - b\sigma^2)x_1^2 + 4b\sigma\left(\frac{h}{\epsilon_2}\right)^{1/2}(\sigma^2 + \tau^2)^{1/2}x_1 - 4b\frac{h}{\epsilon_2}(\sigma^2 + \tau^2) + 2h = 0.$$

The condition for a double root is:

$$4b^2\sigma^2\frac{h}{\epsilon_2}(\sigma^2 + \tau^2) = (a - b\sigma^2)\left\{2h - 4b\frac{h}{\epsilon_2}(\sigma^2 + \tau^2)\right\}$$

which reduces to the equation:

$$\frac{\sigma^2 + \tau^2}{\epsilon_2} = \frac{a - b\sigma^2}{2ab}.$$

This equation (which is Statement 2, e), 10)) could of course be
verified algebraically from the other equations of e); the algebra is left
out since the geometric proof suffices.

The  remaining  computations  are  now  easily  completed  and  similar 
arguments  complete  the  proof  of  Statement  8. 
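
The double-root (tangency) condition can be checked numerically. The Python sketch below substitutes a line $x_2 = -\sigma x_1 + k$ into the boundary $a x_1^2 - b x_2^2 + 2h = 0$ and evaluates the discriminant of the resulting quadratic in $x_1$. The function names and the sample constants are hypothetical; in particular, the values of $a$, $b$ and $\sigma$ are not taken from the report and need only satisfy $a - b\sigma^2 > 0$.

```python
import math


def envelope_intercept(a, b, sigma, h):
    """Intercept k for which x2 = -sigma*x1 + k touches the boundary of R:
    k^2 = 2h(a - b sigma^2) / (a b)."""
    return math.sqrt(2.0 * h * (a - b * sigma ** 2) / (a * b))


def tangency_discriminant(a, b, sigma, h, k):
    """Discriminant of (a - b s^2) x1^2 + 2 b s k x1 + (2h - b k^2) = 0,
    obtained by substituting the line into a x1^2 - b x2^2 + 2h = 0."""
    A = a - b * sigma ** 2
    B = 2.0 * b * sigma * k
    C = 2.0 * h - b * k * k
    return B * B - 4.0 * A * C


def tangency_point(a, b, sigma, k):
    """Double root x1 = -B/(2A) and the matching x2 on the line."""
    x1 = -b * sigma * k / (a - b * sigma ** 2)
    return x1, -sigma * x1 + k


if __name__ == "__main__":
    a, b, sigma, h = 9.0, 3.0, 1.0, 0.5   # illustrative values only
    k = envelope_intercept(a, b, sigma, h)
    print("k =", k, "discriminant =", tangency_discriminant(a, b, sigma, h, k))
    print("tangency point:", tangency_point(a, b, sigma, k))
```

A smaller intercept gives a negative discriminant (the line misses the boundary) and a larger one gives two crossings, so the envelope is the limiting, tangent case.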
The following statement enables us to give a fairly clear picture of
the approximate location of those orbits which make a transit of the region
near the equilibrium point ($\alpha_1\alpha_2 < 0$). This picture carries over to the
restricted problem with little difficulty and suggests a "possible" means
of giving an existence proof for the periodic orbits of M. Davidson.
(However, the present author has not been able to carry out any proof as
yet.)

Before giving this statement, we state a lemma. In the lemma,
$\cos^{-1}(y)$ denotes that angle between 0 and $\pi$ whose cosine is $y$
(provided $|y| \le 1$).

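Lemma:

$$\alpha\cos\theta + \beta\sin\theta \ge \gamma \iff |\chi - \theta| \le \cos^{-1}\left[\frac{\gamma}{(\alpha^2 + \beta^2)^{1/2}}\right]$$

where

$$\cos\chi \sim \alpha; \qquad \sin\chi \sim \beta;$$

the equality signs hold simultaneously. If $\gamma^2 > \alpha^2 + \beta^2$ the inequality
never holds.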
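
The lemma can be spot-checked numerically. In the Python sketch below, $|\chi - \theta|$ is interpreted as the angular distance on the circle (an assumption about the intended meaning), and the case $|\gamma| \le (\alpha^2 + \beta^2)^{1/2}$ is assumed so that the inverse cosine is defined. The function names are hypothetical.

```python
import math


def angular_distance(x, y):
    """Distance between two angles on the circle, in [0, pi]."""
    d = (x - y) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)


def direct_test(alpha, beta, gamma, theta):
    """Left-hand side of the lemma, evaluated directly."""
    return alpha * math.cos(theta) + beta * math.sin(theta) >= gamma


def lemma_test(alpha, beta, gamma, theta):
    """Right-hand side of the lemma: |chi - theta| <= arccos(gamma / r),
    where chi is the direction of (alpha, beta) and r its length.
    Assumes |gamma| <= r."""
    r = math.hypot(alpha, beta)
    chi = math.atan2(beta, alpha)
    return angular_distance(chi, theta) <= math.acos(gamma / r)


if __name__ == "__main__":
    for k in range(0, 63, 7):
        theta = 0.1 * k
        print(round(theta, 1), direct_test(1.0, -0.7, 0.5, theta),
              lemma_test(1.0, -0.7, 0.5, theta))
```

The two tests agree because $\alpha\cos\theta + \beta\sin\theta = (\alpha^2 + \beta^2)^{1/2}\cos(\theta - \chi)$.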


Statement  9 

Let $z_1 = \rho\cos\theta$; $z_2 = \rho\sin\theta$.

Let  x = (xj , x2  ) denote  any  poi 

Tr  axi  + bcr  x2  ax 

If  ^ — ■'  Y2  * — 


R. 

<rx2 


cos  xx  ~ 1 


COS  X2  ~ 1 


sin  xx  ~ -<r 


sin  X2  ~ o' 


a)  Then  for  1 y2  1 .<  1,  we  haver: 

a . >_  0 <=>  1 0 - x . 1 < cos  * 


(1  + o-2  )i  • 


b)  It follows (Statement 8) that $|\gamma_i| < (1 + \sigma^2)^{1/2}$ only in the strip
between the lines enveloping the orbits with $\alpha_i = 0$ and that $|\gamma_i| = (1 + \sigma^2)^{1/2}$
on the boundary of these strips.
Proof of Statement 9


From Statement 2, we have

$$|\epsilon_1|\,\alpha_1 = a x_1 + b\sigma x_2 + \mu z_1 - \mu\sigma z_2$$
$$|\epsilon_1|\,\alpha_2 = a x_1 - b\sigma x_2 - \mu z_1 - \mu\sigma z_2$$


Replacing $z_1$ by $\rho\cos\theta$ and $z_2$ by $\rho\sin\theta$ we get

$$\alpha_1 \ge 0 \iff \cos\theta - \sigma\sin\theta \ge -\frac{a x_1 + b\sigma x_2}{\mu\rho}$$

and

$$\alpha_2 \ge 0 \iff \cos\theta + \sigma\sin\theta \le \frac{a x_1 - b\sigma x_2}{\mu\rho}$$


An  application  of  the  lemma  completes  the  proof. 
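
The two expressions from Statement 2 quoted in this proof give a direct test for the signs of $\alpha_1$ and $\alpha_2$ at any phase point. The Python sketch below evaluates them; the function names and the numerical constants are hypothetical, chosen only to exercise the formulas, and no attempt is made here to enforce the relations among $a$, $b$, $\sigma$ and $\mu$ listed in Statement 2.

```python
def alpha_signs(x1, x2, z1, z2, a, b, sigma, mu):
    """Return (|eps1|*alpha1, |eps1|*alpha2); only the signs matter here."""
    s1 = a * x1 + b * sigma * x2 + mu * z1 - mu * sigma * z2
    s2 = a * x1 - b * sigma * x2 - mu * z1 - mu * sigma * z2
    return s1, s2


def makes_transit(x1, x2, z1, z2, a, b, sigma, mu):
    """True when alpha1*alpha2 < 0, i.e. the solution crosses the equilibrium region."""
    s1, s2 = alpha_signs(x1, x2, z1, z2, a, b, sigma, mu)
    return s1 * s2 < 0


if __name__ == "__main__":
    consts = dict(a=9.0, b=3.0, sigma=1.0, mu=1.0)   # illustrative values only
    print(makes_transit(0.0, 0.0, 1.0, 0.0, **consts))   # at the origin, moving right
    print(makes_transit(1.0, 0.0, 0.0, 0.0, **consts))   # at rest, right of the origin
```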


Statement 10 (consequence of 9)

From 9, it follows that orbits with $\alpha_i = 0$ cut the line $\gamma_i = 0$
in a direction orthogonal to the enveloping lines of these orbits (i = 1, 2).
Thus the lines $\gamma_i = 0$ must pass through the points of tangency of the
enveloping lines with the boundary of R.

We further observe that to the "right" of the line $\gamma_i = 0$, $\chi_i$ is
acute, while to the left of the line $\gamma_i = 0$, $\chi_i$ is obtuse. The results
implied by figure 1 are easy consequences. In particular, we see for
example that any orbits in the regions I, I', I''; II, II', II'' have
$\alpha_1\alpha_2 > 0$ while those in the regions III, III' have $\alpha_1\alpha_2 < 0$. The
situation in the strips is not as simple, but is fairly clear.


Figure 1.

1)  The (two) solid dark lines through the points A and D are the
enveloping lines of solutions with $\alpha_1 = 0$. The corresponding lines
through B and C are the enveloping lines of solutions with $\alpha_2 = 0$.
Any solution with $\alpha_1 = 0$ or $\alpha_2 = 0$ must lie in the corresponding strip
bounded by these lines.

2)  At P, the shaded wedge indicates the directions at P for which
the corresponding solution has $\alpha_1 < 0$. At P' the shaded wedge
indicates $\alpha_1 > 0$. Similarly at Q the wedge indicates $\alpha_2 < 0$, at Q',
$\alpha_2 > 0$. On the dotted line AD the wedge has angle $\pi$ corresponding
to $\gamma_1 = 0$. CB has a similar meaning with regard to the strip for $\alpha_2$.

3)  The solid lines parallel to the strips indicate the regions where
the corresponding $\alpha_i > 0$ for all possible angles. The dotted lines
similarly indicate where $\alpha_i < 0$.

4)  Thus we can see that in regions I, I', I'', both of $\alpha_1$, $\alpha_2$ are
positive, while in the regions II, II', II'', $\alpha_1$ and $\alpha_2$ are negative.
Finally in region III, $\alpha_1 > 0$; $\alpha_2 < 0$, while in III', $\alpha_1 < 0$; $\alpha_2 > 0$.

5)  In the strips we must determine the sign of $\alpha$ from the
direction of the velocity vector: e.g., at P, any solution whose velocity
vector lies in the shaded wedge has $\alpha_2 > 0$, $\alpha_1 < 0$, etc.

Thus we have a geometric criterion for determining whether or not a
solution will make a transit of the equilibrium region. Note in
particular that such a solution must stay inside one or the other of the strips
away from the equilibrium, and that as it crosses the equilibrium region
it changes strips. Solutions going from right to left are "on the bottom";
those from left to right on top.

We  conclude  with  a remark  which  may  have  some  "engineering" 
value: 


Statement 11. The (two) solutions for which $|\beta| = 0$ are hyperbolas;
these solutions correspond to those orbits which cross the region of the
equilibrium point the fastest.

(Corresponding solutions for the restricted problem exist and are
well approximated by these, in the equilibrium region, for energies
slightly larger than that of the equilibrium.)

The equations for these orbits are

-a  Xj  = vt  zz 


or 

-<r  xj  + x22  * 2hv2  (bT2  + v2  ) b"1 

2h  v2 
a - 

©2  b 

Proof: Statement 8 plus some algebra.

(Note that the left hand side is determined from geometrical
considerations alone, while the right hand side follows by letting $x_1 = 0$
and using the energy equation.)

This  completes  the  present  collection  of  statements. 
INTRODUCTION


To  date,  no  closed  form  solution  of  the  equations  representing 
minimum  fuel  flight  of  a high  thrust  vehicle  operating  in  a vacuum 
under  an  inverse  square  gravitational  attraction  has  been  determined. 
Optimum  trajectories,  under  these  conditions,  must  therefore  be 
calculated  by  numerical  methods  and  iteration  techniques. 

On the other hand, the powerful methods of classical (or
variational) mechanics hold promise of solving "all" dynamical problems,
the "only" difficulty being the establishment of a Hamiltonian function
in a separable form. The solution of "all" dynamical problems using
these methods will therefore not be imminent pending the development
of a general transformation procedure that will transform the
Hamiltonian of any given problem into a separable form.

This  paper  presents  a brief  discussion  of  the  classical  procedures, 
discusses  both  closed  form  and  several  approximate  solution  procedures 
and  shows  the  level  of  application  to  the  minimum  fuel  trajectory  problem. 
THE PROBLEM


The  physical  problem  will  be  taken  to  be  the  determination  of 
trajectories  for  minimum  fuel  consumption  for  vehicle  flight  in  a 
vacuum  under  the  influence  of  a high  level,  constant  thrust  and  an 
inverse  square  gravity  field.  This  of  course  is  not  the  most  general 
problem  which  could  include  variable  thrust  levels,  higher  order 
gravitational  attractions,  atmospheric  loads  and  disturbances,  and 
numerous other variables. However, it is general enough to
describe most of the solution difficulties inherent in this type of problem.

The two dimensional equations of motion of a point mass vehicle
subjected to the forces described above may be expressed in cartesian
coordinates as


$$\ddot{x} = \frac{F}{m}\sin\chi - \frac{\mu}{r^3}\,x$$
$$\ddot{y} = \frac{F}{m}\cos\chi - \frac{\mu}{r^3}\,y \qquad (1)$$

where $x$ and $y$ are horizontal and vertical coordinates respectively,
$\mu$ is the gravitational constant, $F$ is the constant thrust, $m$ is the vehicle
mass, $\chi$ is the angle of thrust direction measured from the vertical
and $r$ is the radius or distance of the vehicle from the center of
attraction ($r = (x^2 + y^2)^{1/2}$). Specifying now that the mass flow $\dot{m}$ shall
be maintained at a constant rate $K$, and introducing new variables as:

$$q_1 = \dot{x}, \quad q_2 = \dot{y}, \quad q_3 = x, \quad q_4 = y, \quad q_5 = m \qquad (2)$$

the equations of motion in first order form become

$$\dot{q}_1 = \frac{F}{q_5}\sin\chi - \frac{\mu}{r^3}\,q_3$$
$$\dot{q}_2 = \frac{F}{q_5}\cos\chi - \frac{\mu}{r^3}\,q_4$$
$$\dot{q}_3 = q_1 \qquad (3)$$
$$\dot{q}_4 = q_2$$
$$\dot{q}_5 = -K$$

It is noted that, due to the constancy restriction on the mass flow,
a minimum fuel trajectory is now analogous to a minimum time
trajectory. The problem now is to determine the control variable $\chi$ such as
to insure that any trajectory obtained through an integration of equations
(3) will be a minimum time (fuel) trajectory. It is therefore necessary
to apply some analytical optimization technique. Both the Calculus of
Variations and Pontryagin's Maximum Principle are usable here and
yield identical results. However, since it will be necessary to have a
Hamiltonian available for later applications, the Pontryagin technique
(Reference 1) will be used.

Defining the auxiliary variables as $p_i$ ($i = 1, \ldots, 5$), the Pontryagin
Hamiltonian function becomes

$$H(p_i, q_i) = \frac{F}{q_5}(p_1\sin\chi + p_2\cos\chi) - \frac{\mu}{r^3}(p_1 q_3 + p_2 q_4) + p_3 q_1 + p_4 q_2 - p_5 K \qquad (4)$$

The condition that this function maintain a maximum is then

$$\frac{\partial H}{\partial\chi} = 0 = \frac{F}{q_5}(p_1\cos\chi - p_2\sin\chi) \qquad (5)$$

from which

$$\tan\chi = p_1/p_2$$

Hence

$$\sin\chi = \frac{p_1}{(p_1^2 + p_2^2)^{1/2}}; \qquad \cos\chi = \frac{p_2}{(p_1^2 + p_2^2)^{1/2}} \qquad (6)$$

Substitution of equation (6) in (4) then yields

$$H(p_i, q_i) = \frac{F}{q_5}(p_1^2 + p_2^2)^{1/2} - \frac{\mu}{r^3}(p_1 q_3 + p_2 q_4) + p_3 q_1 + p_4 q_2 - p_5 K \qquad (7)$$

The equations may then be expressed:

$$\dot{q}_i = \partial H/\partial p_i; \qquad \dot{p}_i = -\partial H/\partial q_i \qquad (i = 1, \ldots, 5) \qquad (8)$$

The problem of obtaining the optimum trajectory now becomes the
problem of integrating equations (8).
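
To make the structure of equations (3), (6), (7) and (8) concrete, the Python sketch below codes the state rates with the steering angle eliminated through equation (6), codes the Hamiltonian (7), and checks by finite differences that $\dot{q}_i = \partial H/\partial p_i$. The function names, argument order and sample numbers are assumptions made for illustration; this is not the authors' program.

```python
import math


def optimal_steering(p1, p2):
    """Return (sin chi, cos chi) from equation (6)."""
    norm = math.hypot(p1, p2)
    return p1 / norm, p2 / norm


def state_rates(q, p, F, K, mu):
    """Right-hand sides of equations (3) with chi chosen by equation (6)."""
    q1, q2, q3, q4, q5 = q
    sin_chi, cos_chi = optimal_steering(p[0], p[1])
    r3 = (q3 * q3 + q4 * q4) ** 1.5
    return [F / q5 * sin_chi - mu / r3 * q3,
            F / q5 * cos_chi - mu / r3 * q4,
            q1,
            q2,
            -K]


def hamiltonian(q, p, F, K, mu):
    """Hamiltonian of equation (7)."""
    q1, q2, q3, q4, q5 = q
    p1, p2, p3, p4, p5 = p
    r3 = (q3 * q3 + q4 * q4) ** 1.5
    return (F / q5 * math.hypot(p1, p2)
            - mu / r3 * (p1 * q3 + p2 * q4)
            + p3 * q1 + p4 * q2 - p5 * K)


if __name__ == "__main__":
    q = [0.1, 0.2, 1.0, 0.5, 2.0]      # illustrative state
    p = [0.3, 0.4, 0.1, -0.2, 0.05]    # illustrative auxiliary variables
    rates = state_rates(q, p, 1.0, 0.01, 1.0)
    eps = 1e-6
    for i in range(5):
        hi, lo = list(p), list(p)
        hi[i] += eps
        lo[i] -= eps
        dH = (hamiltonian(q, hi, 1.0, 0.01, 1.0)
              - hamiltonian(q, lo, 1.0, 0.01, 1.0)) / (2 * eps)
        print(i + 1, rates[i], dH)
```

The agreement of the two printed columns is simply the first half of equations (8); the second half, $\dot{p}_i = -\partial H/\partial q_i$, would be obtained the same way by differencing in $q$.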
APPROACH


The approach used to study solutions of equations (8) is the
Hamilton-Jacobi theory of canonical transformations. This theory
was developed for basic dynamical systems; however, it is applicable
to any system whose governing equations may be expressed in first
order form as

$$\dot{q}_i = \partial F/\partial p_i; \qquad \dot{p}_i = -\partial F/\partial q_i \qquad (i = 1, \ldots, n) \qquad (9)$$

where the function $F(q_i, p_i)$ is not restricted to the Hamiltonian of
classical mechanics, but can be any function which allows
presentation in the above canonical form. It is, however, usually referred to
as the Hamiltonian function or simply the Hamiltonian.

Now, examining the equations (9), it is seen that if one of the
$q_i$ (or $p_i$) is not present in the Hamiltonian (i.e., if a variable is cyclic
or ignorable) then the partial derivative of $F$ with respect to that
variable is zero and the corresponding $p_i$ (or $q_i$) is constant. Consequently,
if the system can be transformed to a new system of coordinates, while
maintaining the canonical form, such that all of the new coordinates and
their conjugates except one are cyclic, then the problem is solved. The
most direct way to do this is to set the Hamiltonian itself equal to the
one non-cyclic new coordinate,

$$F'(P_i, Q_i) = Q_1$$

which gives

$$\dot{P}_1 = -1$$
$$\dot{Q}_i = 0 \qquad (i = 1, \ldots, n) \qquad (10)$$
$$\dot{P}_i = 0 \qquad (i = 2, \ldots, n)$$

Hence all $Q_i = \text{constant} = \alpha_i$ and all $P_i = \text{constant} = \beta_i$ except
$P_1 = (\beta_1 - t)$.

It is however necessary to determine the canonical coordinate
transformation required to transform

$$F(q_i, p_i) \Rightarrow F'(Q_i, P_i) = Q_1 \qquad (11)$$

To do this, it is necessary to introduce a generating function*
$S(q_i, P_i)$, a function of one set of old variables and one set of new
variables. The transformation equations may then be written

$$p_i = \partial S/\partial q_i; \qquad Q_i = \partial S/\partial P_i \qquad (i = 1, \ldots, n) \qquad (12)$$

$S(q_i, P_i)$, however, must still be determined. This may be done
(theoretically at least) by substituting the applicable transformation
equation

$$p_i = \partial S/\partial q_i$$

into the old Hamiltonian and setting it equal to the new Hamiltonian

$$F(q_i, \partial S/\partial q_i) = Q_1 \qquad (13)$$

*There will be no discussion here as to the basic differences between
Hamilton's Principal function W and Jacobi's function S. Also, S may
take any of the four forms $S(q_i, P_i)$, $S(q_i, Q_i)$, $S(Q_i, p_i)$ or $S(p_i, P_i)$ as
needed in a particular problem. A discussion of these areas appears
in Reference 2.
$S(q_i, P_i)$ is then determined through the solution of the partial
differential equation (13), which is usually referred to as the
Hamilton-Jacobi equation. With $S(q_i, P_i)$ known, the necessary
transform relations may be obtained from equations (12):

$$p_j = p_j(Q_i, P_i)$$
$$q_j = q_j(Q_i, P_i) \qquad (j = 1, \ldots, n) \qquad (14)$$

One further item, the canonical perturbation technique, might
be mentioned before concluding this discussion of the procedure used.
Often, it is possible to divide the Hamiltonian into the sum of two
parts, one of which may be considered as a perturbation. The equations
then appear as

$$\dot{q}_i = \frac{\partial F_0}{\partial p_i} - \frac{\partial F_1}{\partial p_i}; \qquad \dot{p}_i = -\frac{\partial F_0}{\partial q_i} + \frac{\partial F_1}{\partial q_i} \qquad (15)$$

where $F = F_0 - F_1$.

The procedure then is to neglect the $F_1$ portion and solve the equations

$$\dot{q}_i = \frac{\partial F_0}{\partial p_i}; \qquad \dot{p}_i = -\frac{\partial F_0}{\partial q_i} \qquad (16)$$

using a generating function $S(q_i, P_i)$ and the Hamilton-Jacobi relations
to obtain

$$q_j = q_j(Q_i, P_i); \qquad p_j = p_j(Q_i, P_i) \qquad (17)$$

These solutions (17) to the first part of the problem are then
substituted into the original $F = F_0 - F_1$ and into the original equations (15).
Then, after several straightforward, but lengthy, manipulations
(see References 3 and 4), the following equations in the new
variables result:

$$\dot{Q}_i = \frac{\partial F_1(P_i, Q_i)}{\partial P_i}; \qquad \dot{P}_i = -\frac{\partial F_1(P_i, Q_i)}{\partial Q_i} \qquad (18)$$

Hopefully then, $F_1(P_i, Q_i)$ is in a simple form such that the
Hamilton-Jacobi equation for this part of the problem may be solved either
completely or approximately.

The  net  result  of  these  procedures,  whether  the  direct  approach 
or  a perturbation  technique  is  used,  is  that  the  problem  of  integrating 
the  original  equations  of  motion,  equations  (9),  has  been  "reduced"  to 
the  problem  of  finding  a solution  of  the  Hamilton- Jacobi  equation, 
equation  (13). 
SOLUTION OF HAMILTON-JACOBI EQUATIONS


Two methods which give closed form solutions to some
Hamilton-Jacobi equations are the method of separation of variables (Reference
2) and a closely related though more orderly method known as Jacobi's
Method (Reference 5). The method of separation of variables is
probably the easiest method of solving the Hamilton-Jacobi equation when
it is applicable. However, in application the method is not well
organized and is quite dependent upon the skill of the operator to "see" the
separation. Also, the question of whether or not the equation is
separable depends upon the coordinates employed. The restricted two
body problem is separable in polar (or spherical) coordinates, but not
in cartesian, and the coordinates for which the famous three body
problem is separable have evaded investigators for years.

Some insight into whether or not the H-J equation is separable in
a particular system of coordinates may be gained through the
development of a separation criterion.

The real question of separability is whether functions
of the form

$$p_i = p_i(q_i, \alpha_1, \ldots, \alpha_n) \qquad (19)$$

can be found which, when substituted in

$$H(q_1, q_2, \ldots, q_n, p_1, p_2, \ldots, p_n) = E \qquad (20)$$

will cancel out all the $q_i$'s.
Our purpose is to find the condition that the Hamiltonian be
separable with respect to a set of coordinates. If this condition
is not satisfied in one set of coordinates, then one needs to find the
proper coordinates which satisfy the condition.

Now let us assume that we can find $p_i$ as in (19) which satisfies
(20). It follows that $p_i$ and its derivative with respect to $q_i$ are
functions of a single coordinate $q_i$. Differentiating (20) with respect to $q_i$,
we obtain:

$$\frac{\partial H}{\partial q_i} + \frac{\partial H}{\partial p_i}\frac{\partial p_i}{\partial q_i} = 0 \qquad (21)$$

Let us introduce a new function $\rho_i$ of the form:

$$\rho_i = f(q_1, q_2, \ldots, q_n; p_1, p_2, \ldots, p_n) \qquad (22)$$

such that it will satisfy the relation:

$$\frac{\partial H}{\partial q_i} + \frac{\partial H}{\partial p_i}\rho_i = 0 \qquad (23)$$

By comparing (21) and (23) we obtain:

$$\rho_i = \frac{\partial p_i}{\partial q_i} \qquad (23a)$$

and thus $\rho_i$ is a function of $q_i$ alone, since $p_i$ is a function of $q_i$
alone by (19). By differentiating (23a) with respect to $q_j$, and keeping
in mind relation (22), we obtain

$$\frac{\partial\rho_i}{\partial q_j} + \frac{\partial\rho_i}{\partial p_j}\frac{\partial p_j}{\partial q_j} = 0 \quad \text{for } j \ne i \qquad (23b)$$
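From (23a) and (23b), we obtain:

$$\frac{\partial\rho_i}{\partial q_j} + \frac{\partial\rho_i}{\partial p_j}\rho_j = 0 \qquad (23c)$$

Differentiating equation (23) with respect to $q_j$, and using (23a), we
obtain:

$$\frac{\partial^2 H}{\partial q_i\partial q_j} + \frac{\partial^2 H}{\partial q_i\partial p_j}\rho_j + \left(\frac{\partial^2 H}{\partial p_i\partial q_j} + \frac{\partial^2 H}{\partial p_i\partial p_j}\rho_j\right)\rho_i + \frac{\partial H}{\partial p_i}\left(\frac{\partial\rho_i}{\partial q_j} + \frac{\partial\rho_i}{\partial p_j}\rho_j\right) = 0$$

Using (21) and $\partial\rho_i/\partial q_j + (\partial\rho_i/\partial p_j)\rho_j = 0$, we obtain:

$$\frac{\partial^2 H}{\partial q_i\partial q_j} - \frac{\partial^2 H}{\partial q_i\partial p_j}\,\frac{\partial H/\partial q_j}{\partial H/\partial p_j} - \left(\frac{\partial^2 H}{\partial p_i\partial q_j} - \frac{\partial^2 H}{\partial p_i\partial p_j}\,\frac{\partial H/\partial q_j}{\partial H/\partial p_j}\right)\frac{\partial H/\partial q_i}{\partial H/\partial p_i} = 0$$

By simplification

$$\frac{\partial^2 H}{\partial q_i\partial q_j}\frac{\partial H}{\partial p_i}\frac{\partial H}{\partial p_j} - \frac{\partial^2 H}{\partial q_i\partial p_j}\frac{\partial H}{\partial p_i}\frac{\partial H}{\partial q_j} - \frac{\partial^2 H}{\partial p_i\partial q_j}\frac{\partial H}{\partial q_i}\frac{\partial H}{\partial p_j} + \frac{\partial^2 H}{\partial p_i\partial p_j}\frac{\partial H}{\partial q_i}\frac{\partial H}{\partial q_j} = 0 \qquad (24)$$

for $i, j = 1, 2, \ldots, n$ and $i \ne j$.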

Therefore,  the  necessary  condition  that  (20)  be  separable  is  condition  (24). 
It  can  be  easily  shown  that  the  validity  of  equation  (24)  is  also  sufficient 
for  the  integration  through  separation  of  variables. 
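
Condition (24) lends itself to a quick numerical spot check. The Python sketch below evaluates the left-hand side of (24) by central differences for a planar inverse-square Hamiltonian written once in cartesian and once in polar coordinates, illustrating the earlier remark that the two body problem separates in polar but not in cartesian coordinates. All function names, the unit gravitational constant and the sample phase points are assumptions made for the illustration; a vanishing residual at one point is of course only a necessary check, not a proof of separability.

```python
def d1(f, z, k, eps=1e-4):
    """Central-difference first derivative of f with respect to z[k]."""
    zp, zm = list(z), list(z)
    zp[k] += eps
    zm[k] -= eps
    return (f(zp) - f(zm)) / (2.0 * eps)


def d2(f, z, k, l, eps=1e-4):
    """Central-difference second derivative d^2 f / dz[k] dz[l]."""
    return d1(lambda w: d1(f, w, l, eps), z, k, eps)


def separability_residual(H, z, n, i, j):
    """Left-hand side of condition (24) at z = [q_1..q_n, p_1..p_n].

    i and j are zero-based coordinate indices with i != j.
    """
    qi, qj, pi, pj = i, j, n + i, n + j
    Hqi, Hqj = d1(H, z, qi), d1(H, z, qj)
    Hpi, Hpj = d1(H, z, pi), d1(H, z, pj)
    return (d2(H, z, qi, qj) * Hpi * Hpj
            - d2(H, z, qi, pj) * Hpi * Hqj
            - d2(H, z, pi, qj) * Hqi * Hpj
            + d2(H, z, pi, pj) * Hqi * Hqj)


def kepler_cartesian(z):
    """Planar inverse-square Hamiltonian in cartesian variables (x, y, px, py)."""
    x, y, px, py = z
    return 0.5 * (px * px + py * py) - 1.0 / (x * x + y * y) ** 0.5


def kepler_polar(z):
    """The same Hamiltonian in polar variables (r, theta, p_r, p_theta)."""
    r, theta, pr, ptheta = z
    return 0.5 * (pr * pr + ptheta * ptheta / (r * r)) - 1.0 / r


if __name__ == "__main__":
    print("cartesian:", separability_residual(kepler_cartesian, [1.0, 0.5, 0.3, 0.8], 2, 0, 1))
    print("polar:    ", separability_residual(kepler_polar, [1.2, 0.4, 0.3, 0.9], 2, 0, 1))
```

In cartesian variables the only surviving term is $(\partial^2 H/\partial x\,\partial y)\,p_x p_y$, which is nonzero for the inverse-square potential; in polar variables every term vanishes because $\theta$ is cyclic.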

One interesting case of separability is the case where the motion is
known to be periodic. In this case, the proper coordinates are the action
and the angle variables, and the H-J equation is separable.

If the H-J equation is separable in more than one set of coordinates,
then this case is said to be degenerate. There is similarity between
this degeneracy and the general one. Consider the general equation:

$$QX = Q_0 X$$

where $Q$ is an operator, $Q_0$ is a constant, and $X$ is the characteristic
function. It is clear that for each value of $Q_0$ there corresponds one
or more $X$. In case there is only one $X$, then $Q$ is said to be
non-degenerate; otherwise it is called degenerate.

The similarity of the H-J equation with the general case above can
be visualized by taking the Hamiltonian, $H$, as the operator, $Q$, the
constant, $\alpha$, as $Q_0$ and the generating function, $S$, as $X$. If we define the

Hamiltonian  operator:  H = H (a.,  X.,  — ) 

.1  1 o X. 

l 

as  having  the  property 

ft  ftS 

H (*.,  X.  , — - ) S = H (*.,X.,— ) 

1 l 9X  i i 9X. ; 

l i 

H (or  X.)  S = S 
l i 

then  our  H-j  equation  will  take  the  form: 

HS  = aS 

Thus H is non-degenerate or degenerate according to whether the number of
solutions S is one or more than one. This is equivalent to saying that the
equation is separable in one set of coordinates or in more than one.

As mentioned before, Jacobi's method for obtaining solutions
to first order partial differential equations appears more orderly
than the separation techniques. Since this method does not appear
in the literature as frequently as the separation procedures, a
brief development is presented here.

The solution of the H-J equation involves the determination of
the generating function, S. The Hamilton-Jacobi equation may be
written in the form


$$F(q_1, q_2, \ldots, q_n; P_1, P_2, \ldots, P_n) = 0 \qquad (25)$$

where $F = H - \alpha$; $P_i = \partial S/\partial q_i$ and $H$ is the Hamiltonian.

Second, we try to find (n-1) compatible functions to F, i.e., (n-1)
additional functions $F_i$ which satisfy (25), i.e.,

$$F_i(q_1, q_2, \ldots, q_n; P_1, P_2, \ldots, P_n) = \alpha_i, \qquad (i = 1, 2, \ldots, n-1) \qquad (26)$$

where the $\alpha_i$ are arbitrary constants. Third, the $P_1, P_2, \ldots, P_n$ can
be determined from (25) and (26) as functions of q's and $\alpha$'s and such
that these functions, when inserted in the differential relation

$$dS = P_1\,dq_1 + P_2\,dq_2 + \cdots + P_n\,dq_n \qquad (27)$$

yield an integrable equation. The result of integrating (27), whereby
an arbitrary constant is introduced, is our generating function.

Since the proof is too long and complicated in the general case,
let us show the procedure for n = 3:


$$F(q_1, q_2, q_3, P_1, P_2, P_3) = 0 \qquad (28)$$

Let us find two particular integrals of (28) as follows:

$$F_1(q_1, q_2, q_3, P_1, P_2, P_3) = \alpha_1 \qquad (29)$$
$$F_2(q_1, q_2, q_3, P_1, P_2, P_3) = \alpha_2 \qquad (30)$$

where $P_1, P_2, P_3$ are functions of $q_1, q_2, q_3$.

Since $F_1$, $F_2$ are integrals, the "Poisson brackets" satisfy

$$[F, F_1] = 0 \qquad (31)$$

and

$$[F, F_2] = 0 \qquad (32)$$

Moreover, $F_1$ and $F_2$ must be compatible, hence

$$[F_1, F_2] = 0 \qquad (33)$$

Now solve (28), (29), and (30) for $P_1, P_2, P_3$ and form

$$dS = P_1\,dq_1 + P_2\,dq_2 + P_3\,dq_3 \qquad (34)$$

which is required to be integrable.

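In order to find the relations between the $F_i$'s and $P_i$'s which
satisfy the above conditions, we expand (31) in the usual form:

$$\frac{\partial F}{\partial q_1}\frac{\partial F_1}{\partial P_1} + \frac{\partial F}{\partial q_2}\frac{\partial F_1}{\partial P_2} + \frac{\partial F}{\partial q_3}\frac{\partial F_1}{\partial P_3} - \frac{\partial F}{\partial P_1}\frac{\partial F_1}{\partial q_1} - \frac{\partial F}{\partial P_2}\frac{\partial F_1}{\partial q_2} - \frac{\partial F}{\partial P_3}\frac{\partial F_1}{\partial q_3} = 0$$

This is a homogeneous linear partial differential equation for
determining $F_1$. Its subsidiary equations are

$$\frac{dP_1}{\partial F/\partial q_1} = \frac{dP_2}{\partial F/\partial q_2} = \frac{dP_3}{\partial F/\partial q_3} = \frac{dq_1}{-\partial F/\partial P_1} = \frac{dq_2}{-\partial F/\partial P_2} = \frac{dq_3}{-\partial F/\partial P_3} \qquad (35)$$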
These relations also serve as subsidiary equations to (32) for the
determination of F_2. Therefore, if one finds from (35) two
independent integrals F_1 = \alpha_1 and F_2 = \alpha_2, then all the relations
(31), (32) and (33) will be fulfilled, and our task is accomplished.

The procedure for the general case is exactly the same.

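If given the partial differential equation

\[ F(q_1, q_2, \ldots, q_n;\ p_1, p_2, \ldots, p_n) = 0 \]

then form the subsidiary equations

\[ \frac{dp_1}{\partial F/\partial q_1} = \frac{dp_2}{\partial F/\partial q_2} = \cdots = \frac{dp_n}{\partial F/\partial q_n} = \frac{dq_1}{-\partial F/\partial p_1} = \frac{dq_2}{-\partial F/\partial p_2} = \cdots = \frac{dq_n}{-\partial F/\partial p_n} \]

and find (n-1) independent integrals

\[ F_i = \alpha_i, \qquad i = 1, 2, \ldots, n-1 \]

such that

\[ [F_i, F_j] = 0, \qquad i, j = 1, 2, \ldots, n-1, \quad i \neq j \]

Then solve the n equations

\[ F = 0 \quad \text{and} \quad F_i = \alpha_i, \qquad i = 1, 2, \ldots, n-1 \]

for the p's in terms of the q's and \alpha's, and insert their expressions in

\[ dS = p_1\,dq_1 + p_2\,dq_2 + \cdots + p_n\,dq_n \]

Integration of this equation leads to a complete integral of S.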
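
The compatibility conditions of Jacobi's method can be checked numerically before any integration is attempted. The following is a minimal sketch, not part of the original report: it estimates the bracket [f, g] by central differences, using the same sign convention as the expansion of (31). The functions `F_example`, `F1_example` and `F2_example` are hypothetical stand-ins (a particle in a uniform field) chosen only to show one commuting and one non-commuting pair.

```python
def poisson_bracket(f, g, q, p, h=1e-5):
    """Central-difference estimate of
    [f, g] = sum_i (df/dq_i * dg/dp_i - df/dp_i * dg/dq_i)."""

    def partial(fun, which, i):
        # Differentiate fun with respect to q_i (which == "q") or p_i.
        q_up, q_dn, p_up, p_dn = list(q), list(q), list(p), list(p)
        if which == "q":
            q_up[i] += h
            q_dn[i] -= h
        else:
            p_up[i] += h
            p_dn[i] -= h
        return (fun(q_up, p_up) - fun(q_dn, p_dn)) / (2.0 * h)

    total = 0.0
    for i in range(len(q)):
        total += partial(f, "q", i) * partial(g, "p", i)
        total -= partial(f, "p", i) * partial(g, "q", i)
    return total


# Hypothetical example functions (assumptions for illustration only).
def F_example(q, p):
    return 0.5 * (p[0] ** 2 + p[1] ** 2) + q[1] - 1.0


def F1_example(q, p):
    return p[0]  # q_1 is absent from F_example, so p_1 is an integral


def F2_example(q, p):
    return p[1]  # not an integral: F_example depends on q_2


q0, p0 = [0.3, -0.7], [1.1, 0.4]
bracket_commuting = poisson_bracket(F_example, F1_example, q0, p0)
bracket_noncommuting = poisson_bracket(F_example, F2_example, q0, p0)
```

A candidate integral whose bracket with F is not numerically zero can be rejected before it is used in (27).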
APPLICATION TO THE PROBLEM


Canonical perturbation techniques (using Jacobi's method to
solve the Hamilton-Jacobi equation) may be applied to the problem
of equations (6). To illustrate, consider equations (6)

\[ \dot q_i = \frac{\partial H}{\partial p_i}; \qquad \dot p_i = -\frac{\partial H}{\partial q_i} \qquad (i = 1, \ldots, 5) \]

with H given by equation (5) as

\[ H = \frac{F}{q_5}\sqrt{p_1^2 + p_2^2} + p_3 q_1 + p_4 q_2 - p_5 K - \frac{GM}{r^3}(p_1 q_3 + p_2 q_4) \tag{5} \]

Define

\[ H_0 = \frac{F}{q_5}\sqrt{p_1^2 + p_2^2} + p_3 q_1 + p_4 q_2 - p_5 K \]

\[ H_1 = -\frac{GM}{r^3}(p_1 q_3 + p_2 q_4) \]

The equations are then expressed

\[ \dot q_i = \frac{\partial H_0}{\partial p_i} + \frac{\partial H_1}{\partial p_i}; \qquad \dot p_i = -\frac{\partial H_0}{\partial q_i} - \frac{\partial H_1}{\partial q_i} \qquad (i = 1, \ldots, 5) \]


ZERO  GRAVITY  APPROXIMATION 


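Consider first the problem

\[ \dot q_i = \frac{\partial H_0}{\partial p_i}; \qquad \dot p_i = -\frac{\partial H_0}{\partial q_i} \]

The Hamilton-Jacobi equation for this problem is

\[ \frac{F}{q_5}\sqrt{p_1^2 + p_2^2} + p_3 q_1 + p_4 q_2 - p_5 K - P_5 = 0 \]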
where P_5 is the introduced constant. The subsidiary equations
(analogous to equations 35) of Jacobi's method are then

\[ \frac{dp_1}{p_3} = \frac{dp_2}{p_4} = \frac{dp_3}{0} = \frac{dp_4}{0} = \frac{dp_5}{-\dfrac{F}{q_5^2}\sqrt{p_1^2+p_2^2}} = \frac{dq_1}{-\dfrac{F}{q_5}\dfrac{p_1}{\sqrt{p_1^2+p_2^2}}} = \frac{dq_2}{-\dfrac{F}{q_5}\dfrac{p_2}{\sqrt{p_1^2+p_2^2}}} = \frac{dq_3}{-q_1} = \frac{dq_4}{-q_2} = \frac{dq_5}{K} \tag{40} \]

The third and fourth conditions give

\[ p_3 = P_3 = \text{const.} \tag{41} \]

\[ p_4 = P_4 = \text{const.} \tag{42} \]

as expected, since q_3 and q_4 are cyclic in H_0.

From the first and last of equations (40):

\[ \frac{dp_1}{P_3} = \frac{dq_5}{K}, \qquad p_1 = \frac{P_3}{K} q_5 + P_1 \tag{43} \]

From the second and last of equations (40):

\[ \frac{dp_2}{P_4} = \frac{dq_5}{K}, \qquad p_2 = \frac{P_4}{K} q_5 + P_2 \tag{44} \]

Then, substituting equations (41), (42), (43), and (44) into equation
(39), p_5 becomes

\[ p_5 = \frac{F}{q_5 K}\left[\left(\frac{P_3}{K} q_5 + P_1\right)^2 + \left(\frac{P_4}{K} q_5 + P_2\right)^2\right]^{1/2} + \frac{P_3}{K} q_1 + \frac{P_4}{K} q_2 - \frac{P_5}{K} \tag{45} \]
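
The integrals (41) through (44) can be confirmed by integrating Hamilton's equations for H_0 directly. The sketch below is an illustration only, assuming H_0 in the form given by equation (5) without the gravity term; the numerical values of `F_THRUST`, `K_RATE` and the initial state are arbitrary choices, not data from the report. It uses a classical fourth-order Runge-Kutta step and then compares p_1 - (p_3/K) q_5 and H_0 at the start and end of the arc.

```python
import math

F_THRUST = 2.0  # thrust magnitude F (arbitrary illustrative value)
K_RATE = 1.0    # mass flow constant K (arbitrary illustrative value)


def h0(y):
    """Zero gravity Hamiltonian H_0 for state y = [q1..q5, p1..p5]."""
    q1, q2, q3, q4, q5, p1, p2, p3, p4, p5 = y
    return F_THRUST / q5 * math.hypot(p1, p2) + p3 * q1 + p4 * q2 - p5 * K_RATE


def rhs(y):
    """Hamilton's equations dq/dt = dH0/dp, dp/dt = -dH0/dq."""
    q1, q2, q3, q4, q5, p1, p2, p3, p4, p5 = y
    s = math.hypot(p1, p2)
    return [
        F_THRUST * p1 / (q5 * s),   # dq1/dt
        F_THRUST * p2 / (q5 * s),   # dq2/dt
        q1,                         # dq3/dt
        q2,                         # dq4/dt
        -K_RATE,                    # dq5/dt
        -p3,                        # dp1/dt
        -p4,                        # dp2/dt
        0.0,                        # dp3/dt (q3 is cyclic)
        0.0,                        # dp4/dt (q4 is cyclic)
        F_THRUST * s / q5 ** 2,     # dp5/dt
    ]


def rk4_step(y, dt):
    k1 = rhs(y)
    k2 = rhs([a + 0.5 * dt * b for a, b in zip(y, k1)])
    k3 = rhs([a + 0.5 * dt * b for a, b in zip(y, k2)])
    k4 = rhs([a + dt * b for a, b in zip(y, k3)])
    return [a + dt / 6.0 * (b + 2 * c + 2 * d + e)
            for a, b, c, d, e in zip(y, k1, k2, k3, k4)]


def integrate(y0, t_end, n_steps):
    y = list(y0)
    dt = t_end / n_steps
    for _ in range(n_steps):
        y = rk4_step(y, dt)
    return y


def p1_integral(y):
    """P_1 = p_1 - (p_3 / K) q_5, constant along the flow by (43)."""
    return y[5] - y[7] / K_RATE * y[4]


y_start = [0.0, 0.0, 1.0, 0.0, 10.0, 1.0, 2.0, 0.3, 0.1, 0.5]
y_end = integrate(y_start, 1.0, 1000)
```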
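The equation for obtaining the generating function (analogous to
equation 34) is

\[ dS = p_1\,dq_1 + p_2\,dq_2 + p_3\,dq_3 + p_4\,dq_4 + p_5\,dq_5 \]

Substitution for the p_i and integrating gives:

\[ S = \left(\frac{P_3}{K} q_5 + P_1\right) q_1 + \left(\frac{P_4}{K} q_5 + P_2\right) q_2 + P_3 q_3 + P_4 q_4 - \frac{P_5}{K} q_5 + \frac{F}{K}\left\{ C + \frac{P_1 P_3 + P_2 P_4}{\sqrt{P_3^2 + P_4^2}} \ln A - \sqrt{P_1^2 + P_2^2}\, \ln B \right\} \tag{46} \]

where

\[ C = \left[\left(\frac{P_3}{K} q_5 + P_1\right)^2 + \left(\frac{P_4}{K} q_5 + P_2\right)^2\right]^{1/2} \tag{47a} \]

\[ A = \frac{2}{K}\left[\sqrt{P_3^2 + P_4^2}\; C + (P_3^2 + P_4^2)\frac{q_5}{K} + (P_1 P_3 + P_2 P_4)\right] \tag{47b} \]

\[ B = \frac{2}{q_5}\left[\sqrt{P_1^2 + P_2^2}\; C + (P_1^2 + P_2^2) + (P_1 P_3 + P_2 P_4)\frac{q_5}{K}\right] \tag{47c} \]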

The transform relations are then obtained from

\[ Q_i = \frac{\partial S}{\partial P_i} \]

Writing G for the braced term of (46),

\[ G = C + \frac{P_1 P_3 + P_2 P_4}{\sqrt{P_3^2 + P_4^2}} \ln A - \sqrt{P_1^2 + P_2^2}\, \ln B, \]

these relations may be solved for the original coordinates as

\[ q_1 = Q_1 - \frac{F}{K}\frac{\partial G}{\partial P_1} \tag{48} \]

\[ q_2 = Q_2 - \frac{F}{K}\frac{\partial G}{\partial P_2} \tag{49} \]

\[ q_3 = Q_3 + Q_1 Q_5 - \frac{F}{K}\left[ Q_5 \frac{\partial G}{\partial P_1} + \frac{\partial G}{\partial P_3} \right] \tag{50} \]

\[ q_4 = Q_4 + Q_2 Q_5 - \frac{F}{K}\left[ Q_5 \frac{\partial G}{\partial P_2} + \frac{\partial G}{\partial P_4} \right] \tag{51} \]

\[ q_5 = -K Q_5 \tag{52} \]

where the partial derivatives are evaluated at q_5 = -K Q_5, so that
C = [(P_1 - P_3 Q_5)^2 + (P_2 - P_4 Q_5)^2]^{1/2}.

CONSTANT GRAVITY - FLAT EARTH


It is now desired to perturb this zero gravity solution into a
solution to the constant gravity flat earth problem. The equations are
then

\[ \dot Q_i = \frac{\partial H'}{\partial P_i}; \qquad \dot P_i = -\frac{\partial H'}{\partial Q_i} \tag{53} \]

where

\[ H' = P_5 - g\,(P_2 - P_4 Q_5) \tag{54} \]

Specifying a determining function W = W(Q_i, \lambda_i), the Hamilton-Jacobi
equation becomes

\[ \frac{\partial W}{\partial Q_5} - g \frac{\partial W}{\partial Q_2} + g Q_5 \frac{\partial W}{\partial Q_4} - \lambda_5 = 0 \tag{55} \]

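Assuming a solution for W as

\[ W = W_1(Q_1) + W_2(Q_2) + W_3(Q_3) + W_4(Q_4) + W_5(Q_5) \tag{56} \]

the Hamilton-Jacobi equation becomes

\[ \frac{dW_5}{dQ_5} - g \frac{dW_2}{dQ_2} + g Q_5 \frac{dW_4}{dQ_4} - \lambda_5 = 0 \tag{57} \]

Since the coordinates Q_1, Q_2, Q_3 and Q_4 are cyclic, P_1, P_2, P_3
and P_4 are constants

\[ P_1 = \lambda_1; \quad P_2 = \lambda_2; \quad P_3 = \lambda_3; \quad P_4 = \lambda_4 \tag{58} \]

Hence equation (57) becomes

\[ \frac{dW_5}{dQ_5} + g \lambda_4 Q_5 - (g \lambda_2 + \lambda_5) = 0 \tag{59} \]

which integrates to give

\[ W_5 = -\tfrac{1}{2} g \lambda_4 Q_5^2 + (g \lambda_2 + \lambda_5)\, Q_5 \tag{60} \]

W_1 through W_4 are determined from equations (58) in the form

\[ W_j = \lambda_j Q_j \qquad (j = 1, 2, 3, 4) \tag{61} \]

Then, W becomes

\[ W = \lambda_1 Q_1 + \lambda_2 Q_2 + \lambda_3 Q_3 + \lambda_4 Q_4 + (g \lambda_2 + \lambda_5)\, Q_5 - \tfrac{1}{2} g \lambda_4 Q_5^2 \tag{62} \]

The coordinates are then obtained from

\[ x_i = \frac{\partial W}{\partial \lambda_i} \qquad (i = 1, \ldots, 5) \]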


which gives

\[ x_1 = Q_1, \quad x_2 = Q_2 + g Q_5, \quad x_3 = Q_3, \quad x_4 = Q_4 - \frac{g}{2} Q_5^2, \quad x_5 = Q_5 \tag{63} \]

and from equation (53)

\[ \lambda_5 = P_5 + g P_4 Q_5 - g P_2 \tag{64} \]

and the new Hamiltonian becomes

\[ H = \lambda_5 \]


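with the equations

\[ \dot x_i = \frac{\partial H}{\partial \lambda_i}; \qquad \dot \lambda_i = -\frac{\partial H}{\partial x_i} \]

which gives

\[ x_i = b_i = \text{const} \quad (i = 1, 2, 3, 4); \qquad \lambda_i = c_i = \text{const} \quad (i = 1, \ldots, 5); \qquad x_5 = t + b_5 \tag{65} \]

Then from 63, 64, 65 and 58

\[ \begin{aligned}
Q_1 &= b_1, & P_1 &= c_1 \\
Q_2 &= b_2 - g\,(b_5 + t), & P_2 &= c_2 \\
Q_3 &= b_3, & P_3 &= c_3 \\
Q_4 &= b_4 + \frac{g}{2}(b_5 + t)^2, & P_4 &= c_4 \\
Q_5 &= b_5 + t, & P_5 &= c_5 + g c_2 - g c_4 (b_5 + t)
\end{aligned} \tag{66} \]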
Then from equations 43, 44, 52, and 66

\[ p_1 = c_1 - c_3 (b_5 + t) \]

\[ p_2 = c_2 - c_4 (b_5 + t) \]

such that the guidance angle expression becomes

\[ \tan \chi = \frac{p_1}{p_2} = k_0\, \frac{t + k_1}{t + k_2} \tag{67} \]

where

\[ k_0 = c_3 / c_4; \qquad k_1 = b_5 - c_1 / c_3; \qquad k_2 = b_5 - c_2 / c_4 \]

and tan \chi is a bilinear function of time as expected.
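
The reduction of the multiplier ratio to the bilinear form can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes tan \chi = p_1/p_2 with p_1 = c_1 - c_3 (b_5 + t) and p_2 = c_2 - c_4 (b_5 + t), and the numerical constants are arbitrary placeholders rather than values from the report.

```python
def tan_chi_from_multipliers(t, c1, c2, c3, c4, b5):
    """tan(chi) = p1 / p2 for the constant gravity, flat earth solution."""
    p1 = c1 - c3 * (b5 + t)
    p2 = c2 - c4 * (b5 + t)
    return p1 / p2


def bilinear_constants(c1, c2, c3, c4, b5):
    """Constants k0, k1, k2 of the bilinear law (requires c3, c4 nonzero)."""
    return c3 / c4, b5 - c1 / c3, b5 - c2 / c4


def tan_chi_bilinear(t, k0, k1, k2):
    """tan(chi) = k0 (t + k1) / (t + k2)."""
    return k0 * (t + k1) / (t + k2)


# Arbitrary illustrative constants.
C1, C2, C3, C4, B5 = 1.0, 2.0, 0.3, 0.1, 0.5
K0, K1, K2 = bilinear_constants(C1, C2, C3, C4, B5)
samples = [0.0, 1.0, 2.5, 7.0]
direct = [tan_chi_from_multipliers(t, C1, C2, C3, C4, B5) for t in samples]
bilinear = [tan_chi_bilinear(t, K0, K1, K2) for t in samples]
```

Only three constants survive in the steering law, which is why the bilinear tangent form is convenient for fitting.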

FORMAT FOR INVERSE SQUARE GRAVITY PERTURBATION

Returning now to the zero gravity solution of equations 41,
42, 43, 44, 45, 48, 49, 50, 51 and 52, substitution into equation 36
yields the Hamiltonian for the inverse square perturbation term as

H*  = p5  - ^r[(Pi  - p3Q5  ) (Qs  - Q1Q5  ) + (P2  - P4Q5  MQ4+Q2  Qs  ) 


F_ 

K 


A 


Pi  -P3Q5  )2  +(P2  -P4Q5  )2 


PiP3+P2P4  1 

“2  + "pf  ‘ °5 


(68) 


V^+P| 


(Ppft  "A2^~  " (p3  + p4  )Q|  + (P1P3  + P2  p4)Q5 


at 


n A1 
Qs 


+ P 


2 (P1+P2  - (P1P3  + P2  P4)Q5  )ln  B 
where A and B are defined in equations 47.

The accompanying equations of motion are then

\[ \dot Q_i = \frac{\partial H^*}{\partial P_i}; \qquad \dot P_i = -\frac{\partial H^*}{\partial Q_i} \tag{69} \]

The presence of the numerous radicals and logarithmic terms makes
the attainment of a solution of the accompanying Hamilton-Jacobi
equation quite improbable by ordinary means. Thus, the use of this
perturbation method and the Hamilton-Jacobi technique displays little
overall advantage in obtaining a closed form solution to the general
problem.

AN  APPROXIMATE  SOLUTION  - INVERSE  SQUARE  GRAVITY 

The difficulty of obtaining a closed form solution leads to the
development of an approximate solution which is taken as a first order
improvement on the constant gravity-flat earth solution. Taking the
complete Hamiltonian of equation 5, the Hamilton-Jacobi equation may
be written

\[ \frac{F}{q_5}\sqrt{p_1^2 + p_2^2} + p_3 q_1 + p_4 q_2 - p_5 K - \frac{GM}{r^3}(p_1 q_3 + p_2 q_4) - P_5 = 0 \tag{70} \]

where

\[ p_i = \frac{\partial S'}{\partial q_i} \]

The  subsidiary  equations  of  Jacobi's  method  are  then 
n~.  2 F pi  GMq3

I pi  + P2  - — prr rr“ 


qstfpi  + P2 


/ 2 . 2 GMq4 

qs  V pi  + P2  - 'r3 


_ d qi  _ dq4.  _ Jas . (7i) 

-qi  -q2  K 

By comparing the above equations with the subsidiary equations of the
flat earth problem it is seen that the primary differences are in the
denominators of the dp_3 and dp_4 terms and that there is an additional term
in the dq_1 and dq_2 denominators. Therefore, let the change in the p_3 and
p_4 terms be of order \epsilon over the constant result of the flat earth
problem:

\[ p_3 = P_3 + 2 \epsilon_1 q_5, \qquad p_4 = P_4 + 2 \epsilon_2 q_5 \tag{72} \]

where \epsilon_1 and \epsilon_2 are unknown small constants. Substitution into the
subsidiary equations then gives, from the first and last equations,

\[ \frac{dp_1}{P_3 + 2 \epsilon_1 q_5} = \frac{dq_5}{K}, \qquad p_1 = P_1 + \frac{1}{K}\left(P_3 q_5 + \epsilon_1 q_5^2\right) \tag{73} \]
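and similarly, from the second and last equations,

\[ \frac{dp_2}{P_4 + 2 \epsilon_2 q_5} = \frac{dq_5}{K}, \qquad p_2 = P_2 + \frac{1}{K}\left(P_4 q_5 + \epsilon_2 q_5^2\right) \tag{74} \]

Substitution of 72, 73 and 74 into 70 and solving for p_5 gives

\[ p_5 = \frac{1}{K}\Big\{ \frac{F}{q_5}\Big[\big(P_1 + \tfrac{1}{K}(P_3 q_5 + \epsilon_1 q_5^2)\big)^2 + \big(P_2 + \tfrac{1}{K}(P_4 q_5 + \epsilon_2 q_5^2)\big)^2\Big]^{1/2} + (P_3 + 2 \epsilon_1 q_5)\, q_1 + (P_4 + 2 \epsilon_2 q_5)\, q_2 - \frac{GM}{r^3}\Big[\big(P_1 + \tfrac{1}{K}(P_3 q_5 + \epsilon_1 q_5^2)\big) q_3 + \big(P_2 + \tfrac{1}{K}(P_4 q_5 + \epsilon_2 q_5^2)\big) q_4\Big] - P_5 \Big\} \tag{75} \]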


Now, since p_3 and p_4 were approximated, it should not be expected that
the p's will make the function

\[ dS = p_1\,dq_1 + p_2\,dq_2 + p_3\,dq_3 + p_4\,dq_4 + p_5\,dq_5 \tag{76} \]

an exact differential. Therefore further adjustment must be made
in p_5 to make dS exact. Hence, assume


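\[ p_5 = \frac{F}{q_5 K}\Big[(P_1^2 + P_2^2) + \frac{2}{K}(P_3 P_1 + P_4 P_2)\, q_5 + \frac{1}{K^2}(P_3^2 + P_4^2 + 2K\epsilon_1 P_1 + 2K\epsilon_2 P_2)\, q_5^2\Big]^{1/2} + \frac{1}{K}(P_3 + 2\epsilon_1 q_5)\, q_1 + \frac{1}{K}(P_4 + 2\epsilon_2 q_5)\, q_2 + 2\epsilon_1 q_3 + 2\epsilon_2 q_4 - \frac{P_5}{K} \tag{77} \]

Substitution of equations 72, 73, 74 and 77 into equation (76) and
integrating then gives

\[ S = \frac{1}{K}(P_3 q_5 + \epsilon_1 q_5^2)\, q_1 + P_1 q_1 + \frac{1}{K}(P_4 q_5 + \epsilon_2 q_5^2)\, q_2 + P_2 q_2 + (2\epsilon_1 q_5 + P_3)\, q_3 + (2\epsilon_2 q_5 + P_4)\, q_4 - \frac{P_5}{K} q_5 + \frac{F}{K}\Big\{ C' + \frac{P_3 P_1 + P_2 P_4}{\sqrt{P_3^2 + P_4^2 + 2K\epsilon_1 P_1 + 2K\epsilon_2 P_2}} \ln A' - \sqrt{P_1^2 + P_2^2}\, \ln B' \Big\} \tag{78} \]

where

\[ \begin{aligned}
C' &= \Big[(P_1^2 + P_2^2) + \frac{2}{K}(P_3 P_1 + P_4 P_2)\, q_5 + \frac{1}{K^2}(P_3^2 + P_4^2 + 2K\epsilon_1 P_1 + 2K\epsilon_2 P_2)\, q_5^2\Big]^{1/2} \\
A' &= \frac{2}{K}(P_1 P_3 + P_2 P_4) + \frac{2}{K^2}(P_3^2 + P_4^2 + 2K\epsilon_1 P_1 + 2K\epsilon_2 P_2)\, q_5 + \frac{2}{K}\sqrt{P_3^2 + P_4^2 + 2K\epsilon_1 P_1 + 2K\epsilon_2 P_2}\; C' \\
B' &= \frac{2}{q_5}(P_1^2 + P_2^2) + \frac{2}{K}(P_1 P_3 + P_2 P_4) + \frac{2}{q_5}\sqrt{P_1^2 + P_2^2}\; C'
\end{aligned} \tag{79} \]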

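The new coordinates Q_i are then obtained from

\[ Q_i = \frac{\partial S}{\partial P_i} \]

which may be solved to yield the original coordinates in terms of the
new as

\[ \begin{aligned}
q_1 &= Q_1 - g_1(Q_5) \\
q_2 &= Q_2 - g_2(Q_5) \\
q_3 &= Q_3 + Q_5\,[Q_1 - g_1(Q_5)] - f_1(Q_5) \\
q_4 &= Q_4 + Q_5\,[Q_2 - g_2(Q_5)] - f_2(Q_5) \\
q_5 &= -K Q_5
\end{aligned} \tag{80} \]

where

\[ g_1(Q_5) = \partial G / \partial P_1, \quad g_2(Q_5) = \partial G / \partial P_2, \quad f_1(Q_5) = \partial G / \partial P_3, \quad f_2(Q_5) = \partial G / \partial P_4 \]

\[ G = \frac{F}{K}\Big\{ C'(Q_5) + \frac{P_1 P_3 + P_2 P_4}{\sqrt{P_3^2 + P_4^2 + 2K\epsilon_1 P_1 + 2K\epsilon_2 P_2}} \ln A'(Q_5) - \sqrt{P_1^2 + P_2^2}\, \ln B'(Q_5) \Big\} \]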

Finally, the guidance function is obtained as

\[ \tan \chi = \frac{P_1 - P_3 Q_5 + \epsilon_1 K Q_5^2}{P_2 - P_4 Q_5 + \epsilon_2 K Q_5^2} \tag{81} \]

a bi-quadratic form which is expressed in terms of time by replacing Q_5
by its solution

\[ Q_5 = t + t_0, \qquad t_0 = -\frac{M_0}{K} \]

Then

\[ \tan \chi = \frac{\epsilon_1 K t^2 - (2 M_0 \epsilon_1 + P_3)\, t + (\epsilon_1 M_0 + P_3)\dfrac{M_0}{K} + P_1}{\epsilon_2 K t^2 - (2 M_0 \epsilon_2 + P_4)\, t + (\epsilon_2 M_0 + P_4)\dfrac{M_0}{K} + P_2} \tag{81a} \]

which is an expression containing two unknown constants which may
be used to "fit" known solutions for guidance purposes.
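
The passage from the Q_5 form of the steering law to the time form is a straightforward expansion, and it can be checked numerically. The sketch below is an illustration under stated assumptions: Q_5 = t - M_0/K, and all constants (`P`, `EPS`, `K_RATE`, `M0`) are arbitrary placeholders, not fitted values.

```python
def tan_chi_q5(Q5, P, eps, K):
    """Bi-quadratic steering law written in the variable Q5."""
    P1, P2, P3, P4 = P
    e1, e2 = eps
    num = P1 - P3 * Q5 + e1 * K * Q5 ** 2
    den = P2 - P4 * Q5 + e2 * K * Q5 ** 2
    return num / den


def tan_chi_time(t, P, eps, K, M0):
    """Same law expanded in time, assuming Q5 = t - M0 / K."""
    P1, P2, P3, P4 = P
    e1, e2 = eps
    num = e1 * K * t ** 2 - (2 * M0 * e1 + P3) * t + (e1 * M0 + P3) * M0 / K + P1
    den = e2 * K * t ** 2 - (2 * M0 * e2 + P4) * t + (e2 * M0 + P4) * M0 / K + P2
    return num / den


# Arbitrary illustrative constants.
P = (1.0, 2.0, 0.3, 0.1)
EPS = (0.01, -0.02)
K_RATE, M0 = 1.0, 10.0
times = [0.0, 0.5, 2.0, 4.0]
in_q5 = [tan_chi_q5(t - M0 / K_RATE, P, EPS, K_RATE) for t in times]
in_time = [tan_chi_time(t, P, EPS, K_RATE, M0) for t in times]
```

In a fitting application the two small constants would be adjusted so that this ratio matches a reference steering history; that step is not shown here.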
CONCLUSION


The application of the Hamilton-Jacobi theory of classical mechanics
was useful in obtaining solutions to both the zero gravity and the flat
earth-constant gravity rocket flight problems. These solutions then led
to a first order approximate solution of the inverse square gravitational
attraction problem. However, the theory did not prove useful in obtaining
a closed form solution to the inverse square problem.

The development of a closed form solution by these methods depends
on the proper choice of coordinates to ensure that the Hamilton-Jacobi
equation is separable or solvable. Consequently, it appears that the
usefulness of these methods in high thrust applications will be limited
until the development of a transformation procedure which will transform
the system from the well known cartesian or polar coordinates to a system
of coordinates for which a solution of the Hamilton-Jacobi equation is
guaranteed.
A FIRST ORDER DELAUNAY SOLUTION
FOR MINIMUM FUEL,
LOW THRUST TRAJECTORIES


BY

HARRY PASSMORE, III


BIRMINGHAM, ALABAMA


SUMMARY


This paper utilizes the similarity of the minimum fuel trajectory
equations to those representing a restricted three-body problem to gain
a canonical formulation in the variables of Delaunay. A two step
transform procedure carried to the first order in small parameters is then
presented as an indication of a method that may be followed in higher
order studies. This progress report presents the analytical development
of the procedure as completed through December, 1964.
INTRODUCTION


The minimum fuel equations of motion are synthesized from a
generalized Hamiltonian using Pontryagin's method. These equations are,
of course, identical to those presented in Reference 1, which are developed
through calculus of variations procedures. Examination of the multiplier
equations reveals that they may conceptually be considered as representing
the motion of another (fictitious) body relative to the vehicle. A
transformation of the coordinates then yields equations relative to a
common center with the vehicle position coordinates.

These equations are then in a form quite similar to equations representing
a three body problem in cartesian coordinates. Hence, they are easily
transformed into perturbation equations in elliptic coordinates and thereby
into canonical equations in a set of variables representative of those used by
Delaunay in his lunar studies.

The disturbing functions of the two sets of equations are not identical.
However, the disturbing function of one set may be separated into two parts,
one part of which is identical to the disturbing function of the other set. Two
basic transforms may then be performed which shift the periodic terms into
terms whose coefficients contain higher orders of small parameters. The
method used by Delaunay is not applied directly. Instead, a procedure
similar to that attributed by Poincare to Bohlin, which makes use of a
determining function to obtain the solution to the desired order, is utilized.

The complexity of the problem and the magnitude of the task of
expanding the forcing functions and obtaining the transformed relations
preclude a blind approach to a higher order solution. Hence, a first
order solution, as presented here, will be employed in an effort to gain
insight into the order of solution required to achieve the accuracy required
in space flight trajectory calculations.
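THE EQUATIONS


The two dimensional equations of motion of a constant thrust vehicle
in an inverse square gravitational field may be expressed in first order
form as

\[ \begin{aligned}
\dot x_4 &= -\frac{\mu}{r^3} x_1 + \frac{T}{m} \cos \chi \\
\dot x_5 &= -\frac{\mu}{r^3} x_2 + \frac{T}{m} \sin \chi \\
\dot x_1 &= x_4 \\
\dot x_2 &= x_5 \\
\dot m &= -\epsilon
\end{aligned} \tag{1} \]

A generalized Hamiltonian function may be formulated from equations
(1) as

\[ H = \lambda_1 x_4 + \lambda_2 x_5 - \frac{\mu}{r^3}(\lambda_4 x_1 + \lambda_5 x_2) + \frac{T}{m}(\lambda_4 \cos \chi + \lambda_5 \sin \chi) - \lambda_7 \epsilon \]

where r^2 = x_1^2 + x_2^2.

To obtain the optimum thrust direction requires that

\[ \frac{\partial H}{\partial \chi} = 0 = -\lambda_4 \sin \chi + \lambda_5 \cos \chi, \qquad \tan \chi = \lambda_5 / \lambda_4 \]

from which

\[ \sin \chi = \frac{\lambda_5}{\rho}; \qquad \cos \chi = \frac{\lambda_4}{\rho} \]

where \rho = (\lambda_4^2 + \lambda_5^2)^{1/2}.

Substituting these values for sin \chi and cos \chi, H becomes

\[ H = \lambda_1 x_4 + \lambda_2 x_5 - \frac{\mu}{r^3}(\lambda_4 x_1 + \lambda_5 x_2) + \frac{f}{x_7} \rho + \lambda_7 \alpha \]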
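where

\[ \frac{T}{m} = \frac{T/M_0}{m/M_0}, \qquad x_7 = \frac{m}{M_0} = 1 - \frac{\epsilon}{M_0}\, t = 1 + \alpha t, \]

\[ \frac{T}{M_0} = f, \qquad \frac{T}{m} = \frac{f}{x_7}, \]

and the multiplier \lambda_7 has been rescaled by M_0.

The equations which must be solved to obtain minimum fuel trajectories
become

\[ \begin{aligned}
\dot x_4 &= \partial H / \partial \lambda_4 = -\frac{\mu}{r^3} x_1 + \frac{f}{x_7}\frac{\lambda_4}{\rho} \\
\dot x_5 &= \partial H / \partial \lambda_5 = -\frac{\mu}{r^3} x_2 + \frac{f}{x_7}\frac{\lambda_5}{\rho} \\
\dot x_1 &= \partial H / \partial \lambda_1 = x_4 \\
\dot x_2 &= \partial H / \partial \lambda_2 = x_5 \\
\dot x_7 &= \partial H / \partial \lambda_7 = \alpha \\
\dot \lambda_4 &= -\partial H / \partial x_4 = -\lambda_1 \\
\dot \lambda_5 &= -\partial H / \partial x_5 = -\lambda_2 \\
\dot \lambda_1 &= -\partial H / \partial x_1 = \frac{\mu}{r^3} \lambda_4 - \frac{3 \mu x_1}{r^5}(\lambda_4 x_1 + \lambda_5 x_2) \\
\dot \lambda_2 &= -\partial H / \partial x_2 = \frac{\mu}{r^3} \lambda_5 - \frac{3 \mu x_2}{r^5}(\lambda_4 x_1 + \lambda_5 x_2) \\
\dot \lambda_7 &= -\partial H / \partial x_7 = \frac{f \rho}{x_7^2}
\end{aligned} \]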
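
The optimality condition tan \chi = \lambda_5 / \lambda_4 has two roots half a turn apart; the choice sin \chi = \lambda_5 / \rho, cos \chi = \lambda_4 / \rho is the one that maximizes the thrust term of H. The short sketch below is an illustration, not taken from the report: it picks that branch with `math.atan2` and confirms by a coarse scan that no other angle gives a larger value of \lambda_4 cos \chi + \lambda_5 sin \chi. The multiplier values are arbitrary.

```python
import math


def optimal_thrust_angle(lam4, lam5):
    """Return (chi, rho) with cos(chi) = lam4/rho and sin(chi) = lam5/rho."""
    rho = math.hypot(lam4, lam5)
    chi = math.atan2(lam5, lam4)  # selects the maximizing branch
    return chi, rho


def thrust_term(chi, lam4, lam5):
    """The chi-dependent factor of H: lam4 cos(chi) + lam5 sin(chi)."""
    return lam4 * math.cos(chi) + lam5 * math.sin(chi)


LAM4, LAM5 = -0.6, 0.8  # arbitrary illustrative multipliers
chi_opt, rho = optimal_thrust_angle(LAM4, LAM5)
best_value = thrust_term(chi_opt, LAM4, LAM5)
scan = [thrust_term(2.0 * math.pi * k / 720.0, LAM4, LAM5) for k in range(720)]
```

With the maximizing branch the thrust term equals \rho, which is the form that appears in the reduced Hamiltonian.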

Now, returning to the expressions for sin \chi and cos \chi, and referring to
Figure 1, it may be seen that the thrust direction may be considered as the
direction to some fictitious body a distance \rho from the vehicle. \lambda_4 and
\lambda_5 may then be considered as the coordinates, parallel to x_1 and x_2, of
the fictitious body relative to the vehicle. To obtain equations analogous to
three body equations, the \lambda equations must be transformed to equations
relative to the same center of attraction as the vehicle. This may be
accomplished by introducing
\[ \psi_4 = x_1 + \lambda_4, \qquad \psi_5 = x_2 + \lambda_5, \qquad \psi_7 = \lambda_7 \tag{3} \]

such that \rho is now

\[ \rho = \left[(\psi_4 - x_1)^2 + (\psi_5 - x_2)^2\right]^{1/2} \]

or, upon defining

\[ \Delta^2 = \psi_4^2 + \psi_5^2, \]

\rho becomes

\[ \rho = \left[\Delta^2 - 2 (x_1 \psi_4 + x_2 \psi_5) + r^2\right]^{1/2} \]

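The equations of motion (2) are then transformed to the following second
order equations

\[ \begin{aligned}
\ddot x_1 + \frac{\mu x_1}{r^3} &= \frac{f}{x_7 \rho}(\psi_4 - x_1) \\
\ddot x_2 + \frac{\mu x_2}{r^3} &= \frac{f}{x_7 \rho}(\psi_5 - x_2) \\
\ddot \psi_4 &= \frac{f}{x_7 \rho}(\psi_4 - x_1) - \frac{\mu \psi_4}{r^3} + \frac{3 \mu x_1}{r^5}\left(x_1 \psi_4 + x_2 \psi_5 - r^2\right) \\
\ddot \psi_5 &= \frac{f}{x_7 \rho}(\psi_5 - x_2) - \frac{\mu \psi_5}{r^3} + \frac{3 \mu x_2}{r^5}\left(x_1 \psi_4 + x_2 \psi_5 - r^2\right)
\end{aligned} \]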

Examining  the  right  hand  sides  of  these  equations  and  defining 


Ri  = p 

X7 

p f p A 

R*  = “ " ■ a ■ •*  2? 


- $ (»♦.  ♦ 


The  equations  of  interest  become 

xi  9Rj 


x»  + P 
x2  + P 


r 

x2 

P" 


/ 9xi 


“5*  - 


9R2 


9Rj 


! 9x2 


0R2' 


(4) 


/ a^4 

AJ  / 94-5 

These equations are then identical in form to equations representing a
restricted three body problem and may be transformed by any of several
standard methods available into canonical equations in the Delaunay
variables.
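\[ \begin{aligned}
\dot L_v &= \partial F_v / \partial \ell_v, & \dot \ell_v &= -\partial F_v / \partial L_v \\
\dot G_v &= \partial F_v / \partial g_v, & \dot g_v &= -\partial F_v / \partial G_v \\
\dot K &= \partial F_v / \partial k, & \dot k &= -\partial F_v / \partial K \\
\dot L_T &= \partial F_T / \partial \ell_T, & \dot \ell_T &= -\partial F_T / \partial L_T \\
\dot G_T &= \partial F_T / \partial g_T, & \dot g_T &= -\partial F_T / \partial G_T
\end{aligned} \]

where

\[ F_v = \frac{\mu^2}{2 L_v^2} - \alpha K + R_1, \qquad F_T = \frac{\mu^2}{2 L_T^2} - \alpha K + R_2 \]

and the substitution x_7 = k, \psi_7 = K has been incorporated to account for
the mass equation. The subscript v applies to parameters which represent
the vehicle and the subscript T indicates parameters representing the
thruster body.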


L. 

Now,  upon  adding  - ^7— 2 to  F and 

c*  V 

T 

for  Ri  and  R2  and  defining 


to  F , substituting  the  values 


2 = - H- 


> ♦ Jt, 

A 2r" 


F^  and  F^  may  be  expressed 

2 2 

F - jl  , a 

v 2L  1 " 2L  2 
v T 

F = - F + F2 
I v 


and  the  equations  may  be  expressed 


+ T (xi  y\)4  + x2  4j5  ) 


-«K--  P 


(xj  1^4  +x2  ip5) 


L = 9F  /9f 

V V'  V 

G = 8F  /da 
v v'  ev 

K = 9Fv/ak 


= - 9FU/9LV 
gv  = - 9Fv/9G^ 
k = - 9 F v 1 0 K 


LT  = " 0Fv/9fx  + 9F2  /aiT  ; fT  = + 9Fv/9Lt-9F2  /9Lt 
GT  = - 9Fv/9e  +9F2/9g  ;grT,  = + 9F  /aT  - 9F2  /3LT 


THE  DISTURBING  FUNCTION  EXPANSIONS 


To obtain solutions of equations (5) by applying a Delaunay procedure,
it is necessary to express F_v and F_2 as series expansions in the Delaunay
variables L, G, \ell and g and/or the closely related elliptic parameters, a
and e. These types of expansions are readily available in any of several
texts on celestial mechanics. The actual functions F_v and F_2 of interest
here are not identical with those found in the texts; however, the individual
parameters in the functions are similar and the expansion procedures are
the same. Therefore, only the results of the expansions taken to the first
order in the eccentricity e will be presented here.

The  expanded  form  for  F is  then 

v 


F = ® K - e»  f-85  <L,t  + 1/2 


2L  " 2L 

v T 


+ . 065  LvLt  (L4  + Lt4)  (Lv+Gv)(LT+GT)[cosUv+gv+k-iT-gT) 

+ cos  tfv+gv-k-*T-gr]jj 

+ . 032  L L (L4  + L 4fl/Z  (L  +G  )(h+G)Cos  (I  +g  + 2k-l  -g  ) 
v T v T v v T T L.  v 6v  T &T 

+ cos  (f  + g - 2k-j0  - gJTl 

v Bv  T °Tj 


+ . 637  e L 2 L (L4  + L 4 
v v T v T 


(VS)  COS  (_gv+fT+gT1 


+ .08eTLvLT2(Lv+Gv)(Lv4  + Gv4)-3/2  [sL^4  + 7^ ^G^osH  + g„-g 


T v T v v ' v v ' L_  v ' ' 

. 258  e^L^tL^  + L^,4)  l!z  ^cos  (k+jf^)  + cos  (k- 


V 6v  eT' 


+ . 194  evL^  Lt(Lt+Gt)(L4+L^)  ^ |cos(-gv+k+iT+gr)  + cos(-gv-k+fT+g 

+ 'tElt(vgv)(lt +E|',/'C194  -•o4LtgT| 

X*  jcos  (^v+gv+k-gT)  + cos  U^+g^-k-  gT i] 


- .85  e„L  4(L  4 + L 4)  cos  i, 

1 t v r 
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-.214  LvLtU-v4+  Lt4)  ,/4(Lv+Gv)(Lt+Gt)  cos  («v+gv-«T-gT> 


+ .52  (L  4+  Lj *■) 


llz 


cos  k 


Vz 


cos  2k 


+ . 127  (Lv4  + Lt4) 

+ '.  057  (L4  + L 4)l/z  cos  3k 
v T 

+ .063  (L  4+L4)l/2  cos  4k 
v T 


3 


where  € i = f/p 

Likewise,  the  expansion,  to  the  first  order  in  the  eccentricities,  of 
F2  is 


F2  = - \ 
_ J* 


L [I 

£ T [_2 


’ + L 


(L  +G  )2  L '2  L 4 (L  +G„) 


3 


+ e 4 L 2 L 1 Li^2  (L  +GJ2  (L  +G  ) (3-L  * (L  +G 

v 2 T 32  v T ' T T7  v v - 


-x 

V V V 


>0 

+ L ~2  (l+6‘LL+-Al  "2  L 4 (L.  +G  )2  (L_+G_)  (5L^-G  JTIcos  i 
T T 1_vt64v  T v v T T T T \ 


cos  i 


64  v 


(L  +G  )2  (L  +G  )2  cos  (21  + 2g  -2£  -2g_) 


+ 128  ® 


v +Gt'  E2-Lv‘,(  VGvII  COS  V^V^T-^T1 


4 fl  6T  V*  V<VG/  (VGt'  COS  '“v^VV2*!’’ 

- Ti  eTLv'2V  < VGv,Z  (VGT)J  (2£v+2gv-3<T-2gT) 

- TTs  %Lv"  V «L;Gv,J  <LT+GT)!  cos(3V2gv-2,T-2gT)J 

where  t)  = 
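THE FIRST TRANSFORM


A previous section presented the canonical equations and the
expressions for the forcing functions in terms of the variables L_v,
G_v, \ell_v, g_v, K, k for the vehicle and L_T, G_T, \ell_T, g_T for the
thruster. To aid in the bookkeeping in the transformations and to achieve
a slight realignment of the equations, the following notation is introduced.

\[ \begin{aligned}
L_v &= L_{10}, & \ell_v &= \ell_{10} \\
G_v &= L_{20}, & g_v &= \ell_{20} \\
K &= L_{30}, & k &= \ell_{30} \\
L_T &= -L_{40}, & \ell_T &= \ell_{40} \\
G_T &= -L_{50}, & g_T &= \ell_{50}
\end{aligned} \]

Also, let F_v = F_1 such that

\[ F_T = -F_1 + F_2 \]

The equations of interest, equations 5, then become

\[ \begin{aligned}
\dot L_{10} &= \partial F_1 / \partial \ell_{10}, & \dot \ell_{10} &= -\partial F_1 / \partial L_{10} \\
\dot L_{20} &= \partial F_1 / \partial \ell_{20}, & \dot \ell_{20} &= -\partial F_1 / \partial L_{20} \\
\dot L_{30} &= \partial F_1 / \partial \ell_{30}, & \dot \ell_{30} &= -\partial F_1 / \partial L_{30} \\
\dot L_{40} &= \partial F_1 / \partial \ell_{40} - \partial F_2 / \partial \ell_{40}, & \dot \ell_{40} &= -\partial F_1 / \partial L_{40} + \partial F_2 / \partial L_{40} \\
\dot L_{50} &= \partial F_1 / \partial \ell_{50} - \partial F_2 / \partial \ell_{50}, & \dot \ell_{50} &= -\partial F_1 / \partial L_{50} + \partial F_2 / \partial L_{50}
\end{aligned} \tag{5a} \]

The equations for the first transform are obtained by neglecting
the term F_2 in the above expressions:

\[ \dot L_{j0} = \partial F_1 / \partial \ell_{j0}; \qquad \dot \ell_{j0} = -\partial F_1 / \partial L_{j0} \qquad (j = 1, \ldots, 5) \tag{6} \]

Then expressing

\[ F_1 = F_{10} + F_{11} \]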
where F_{10} is the part of F_1 that does not contain the small parameter
\epsilon_1 and may be seen from the function expansions to be

\[ F_{10} = \frac{\mu^2}{2 L_{10}^2} - \frac{\mu^2}{2 L_{40}^2} - \alpha L_{30} \tag{7} \]

The term F_{11} consists of all terms in F_1 which contain the small
parameter \epsilon_1 and may be expressed as

\[ F_{11} = P_0 + \sum_{i=1}^{n} Q_{0i} \cos \theta_{0i} \tag{8} \]

where P_0 is the part of F_{11} that contains no periodic terms and may be
seen to be

\[ P_0 = -(.85)\, \epsilon_1 \left(L_{10}^4 + L_{40}^4\right)^{1/2} \]

Q_{0i} represents the coefficients of the periodic terms and, as may again
be seen from the expression for F_v in the expansion section, the Q_{0i}
are functions of the small parameter \epsilon_1 and L_{10}, L_{20}, L_{40} and L_{50} only.
The cos \theta_{0i} are the periodic terms, where the \theta_{0i} are given by

\[ \theta_{0i} = p_{1i} \ell_{10} + p_{2i} \ell_{20} + p_{3i} \ell_{30} + p_{4i} \ell_{40} + p_{5i} \ell_{50} \tag{9} \]

or

\[ \theta_{0i} = \sum_{j=1}^{5} p_{ji}\, \ell_{j0} \tag{9a} \]

and n is the number of periodic terms to be considered.

The procedure now is to transform the Hamiltonian of this part of
the problem, F_1, into a new Hamiltonian which is independent of the angle
variables such that

\[ F_1(L_{10}, L_{20}, L_{30}, L_{40}, L_{50}, \ell_{10}, \ell_{20}, \ell_{30}, \ell_{40}, \ell_{50}) = F_1^*(L_{11}, L_{21}, L_{31}, L_{41}, L_{51}) \tag{10} \]

where L_{11}, L_{21}, etc., represent the transformed variables.
To aid in the transformation, a determining function

\[ S = S(L_{j1}, \ell_{j0}) \qquad (j = 1, \ldots, 5) \]

may be used following the procedures of the Hamilton-Jacobi theory.
The equations of transformation are then

\[ L_{j0} = \partial S / \partial \ell_{j0}; \qquad \ell_{j1} = \partial S / \partial L_{j1} \qquad (j = 1, 2, \ldots, 5) \tag{11} \]

and the terms of the Hamiltonian become

\[ F_{10} = F_{10}\!\left(\frac{\partial S}{\partial \ell_{10}}, \frac{\partial S}{\partial \ell_{30}}, \frac{\partial S}{\partial \ell_{40}}\right) \]

\[ F_{11} = F_{11}\!\left(\frac{\partial S}{\partial \ell_{10}}, \frac{\partial S}{\partial \ell_{20}}, \frac{\partial S}{\partial \ell_{40}}, \frac{\partial S}{\partial \ell_{50}}, \ell_{10}, \ell_{20}, \ell_{30}, \ell_{40}, \ell_{50}\right) \]

The determining function may be expanded in powers of the small
parameter \epsilon_1 as

\[ S = S_0 + S_1 + S_2 + \cdots \]

where S_0 does not contain \epsilon_1, S_1 is first order in \epsilon_1, S_2 is second
order, etc. To ensure an identity transformation in case all Q_{0i} happen to
be zero, S_0 must be

\[ S_0 = L_{11} \ell_{10} + L_{21} \ell_{20} + L_{31} \ell_{30} + L_{41} \ell_{40} + L_{51} \ell_{50} \tag{12} \]

or S_0 = \sum_j L_{j1} \ell_{j0}.

The transformed Hamiltonian may also be expanded in powers of the small
parameter as

\[ F_1^* = F_{10}^* + F_{11}^* + F_{12}^* + \cdots \tag{13} \]

where F_{10}^* is of zero order in \epsilon_1, F_{11}^* is of first order, etc.
Substituting the relations for S and F_1^*, the Hamilton-Jacobi equation becomes
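upon substitution of equations 11 and 13 in 10

\[ F_{10}\!\left(L_{j1} + \frac{\partial S_1}{\partial \ell_{j0}} + \frac{\partial S_2}{\partial \ell_{j0}} + \cdots\right) + F_{11}\!\left(L_{j1} + \frac{\partial S_1}{\partial \ell_{j0}} + \frac{\partial S_2}{\partial \ell_{j0}} + \cdots,\ \theta_{0i}\right) = F_{10}^* + F_{11}^* + \cdots \qquad (j = 1, 2, \ldots, 5) \tag{14} \]

F_{10} and F_{11} may then be expanded in Taylor's series which to the first
order in \epsilon_1 become

\[ F_{10} = F_{10}(L_{j1}) + \sum_j \overline{\frac{\partial F_{10}}{\partial L_{j0}}}\, \frac{\partial S_1}{\partial \ell_{j0}} \qquad (j = 1, 2, \ldots, 5) \]

\[ F_{11} = F_{11}(L_{j1}, \theta_{0i}) \qquad (i = 1, 2, \ldots) \]

Substituting these series into the Hamilton-Jacobi equation and equating
terms of like order in \epsilon_1 gives

\[ F_{10}(L_{j1}) = F_{10}^* = \frac{\mu^2}{2 L_{11}^2} - \frac{\mu^2}{2 L_{41}^2} - \alpha L_{31} \tag{15a} \]

\[ \sum_j \overline{\frac{\partial F_{10}}{\partial L_{j0}}}\, \frac{\partial S_1}{\partial \ell_{j0}} + P_1 + \sum_i Q_{1i} \cos \theta_{0i} = F_{11}^*(L_{j1}) \tag{15b} \]

The barred notation denotes \partial F_{10} / \partial L_{j0} evaluated at L_{j0} = L_{j1};
P_1 and Q_{1i} denote the functions P_0 and Q_{0i} with the L_{j1} substituted
for the L_{j0}. Now, since F_{11}^* is a function of the L_{j1} only and since
S_1 and \sum_i Q_{1i} \cos \theta_{0i} are functions of the \ell_{j0}, F_{11}^* can
only be related to the term P_1. Hence,

\[ F_{11}^*(L_{j1}) = P_1 = -(.85)\, \epsilon_1 \left(L_{11}^4 + L_{41}^4\right)^{1/2} \tag{16} \]

\[ \sum_j \overline{\frac{\partial F_{10}}{\partial L_{j0}}}\, \frac{\partial S_1}{\partial \ell_{j0}} + \sum_i Q_{1i} \cos \theta_{0i} = 0 \tag{17} \]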

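Returning to the expression for F_{10}, equation (7), and introducing the
notation

\[ \nu_{10} = \frac{\mu^2}{L_{10}^3}, \qquad \nu_{40} = -\frac{\mu^2}{L_{40}^3} \tag{18} \]

the required derivatives may be evaluated as

\[ \begin{aligned}
\overline{\frac{\partial F_{10}}{\partial L_{10}}} &= -\frac{\mu^2}{L_{11}^3} = -\nu_{11} \\
\overline{\frac{\partial F_{10}}{\partial L_{20}}} &= 0 \\
\overline{\frac{\partial F_{10}}{\partial L_{30}}} &= -\alpha = -\nu_{31} \\
\overline{\frac{\partial F_{10}}{\partial L_{40}}} &= \frac{\mu^2}{L_{41}^3} = -\nu_{41} \\
\overline{\frac{\partial F_{10}}{\partial L_{50}}} &= 0
\end{aligned} \]

or in general

\[ \overline{\frac{\partial F_{10}}{\partial L_{j0}}} = -\nu_{j1} \]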
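The equation for S_1, equation (17), then becomes

\[ \nu_{11} \frac{\partial S_1}{\partial \ell_{10}} + \alpha \frac{\partial S_1}{\partial \ell_{30}} + \nu_{41} \frac{\partial S_1}{\partial \ell_{40}} = \sum_{i=1}^{n} Q_{1i} \cos \theta_{0i} \tag{19} \]

or

\[ \sum_j \nu_{j1} \frac{\partial S_1}{\partial \ell_{j0}} = \sum_i Q_{1i} \cos \theta_{0i} \tag{19a} \]

A solution for this equation may be taken in the form

\[ S_1 = \sum_i A_{1i} \sin \theta_{0i} \tag{20} \]

where the A_{1i} are not functions of any \ell_{j0}. Hence,

\[ \frac{\partial S_1}{\partial \ell_{j0}} = \sum_i \frac{\partial \theta_{0i}}{\partial \ell_{j0}}\, A_{1i} \cos \theta_{0i} \]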
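The partials of \theta_{0i} required are obtained from the expression for
\theta_{0i}, equation (9), as

\[ \frac{\partial \theta_{0i}}{\partial \ell_{10}} = p_{1i}; \qquad \frac{\partial \theta_{0i}}{\partial \ell_{30}} = p_{3i}; \qquad \frac{\partial \theta_{0i}}{\partial \ell_{40}} = p_{4i} \]

or in general

\[ \frac{\partial \theta_{0i}}{\partial \ell_{j0}} = p_{ji} \]

Substitution of the assumed solution into the equation then gives

\[ \sum_i A_{1i} \sum_j \nu_{j1}\, p_{ji} \cos \theta_{0i} = \sum_i Q_{1i} \cos \theta_{0i} \]

Equating coefficients of like cosines yields

\[ A_{1i} = \frac{Q_{1i}}{\sum_j \nu_{j1}\, p_{ji}} \qquad (i = 1, \ldots, n) \tag{21} \]

Hence

\[ S_1 = \sum_{i=1}^{n} \frac{Q_{1i}}{\sum_j \nu_{j1}\, p_{ji}} \sin \theta_{0i} \tag{22} \]

With these values for S_0 and S_1, the determining function to the first
order in \epsilon_1 becomes

\[ S = \sum_j L_{j1} \ell_{j0} + \sum_i \frac{Q_{1i}}{\sum_j \nu_{j1}\, p_{ji}} \sin \theta_{0i} \tag{23} \]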
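
The coefficient rule A_{1i} = Q_{1i} / \sum_j \nu_{j1} p_{ji} is easy to exercise numerically. The sketch below is an illustration under stated assumptions, not the report's computation: the amplitudes `Q`, integer multipliers `p` and rates `nu` are made-up numbers, and the check differentiates S_1 by central differences to confirm that it satisfies \sum_j \nu_{j1} \partial S_1 / \partial \ell_{j0} = \sum_i Q_{1i} \cos \theta_{0i}. A vanishing denominator (a resonant term) is reported as an error, since the first order solution fails there.

```python
import math


def theta(p_i, ell):
    """theta_0i = sum_j p_ji * ell_j0."""
    return sum(m * l for m, l in zip(p_i, ell))


def first_order_coefficients(Q, p, nu):
    """A_1i = Q_1i / sum_j nu_j1 p_ji for each periodic term i."""
    coeffs = []
    for Q_i, p_i in zip(Q, p):
        denom = sum(n * m for n, m in zip(nu, p_i))
        if denom == 0:
            raise ZeroDivisionError("resonant term: sum_j nu_j p_ji vanishes")
        coeffs.append(Q_i / denom)
    return coeffs


def s1(A, p, ell):
    """S_1 = sum_i A_1i sin(theta_0i)."""
    return sum(A_i * math.sin(theta(p_i, ell)) for A_i, p_i in zip(A, p))


def lhs_19a(A, p, nu, ell, h=1e-6):
    """sum_j nu_j1 dS_1/d ell_j0, with the derivative taken numerically."""
    total = 0.0
    for j, nu_j in enumerate(nu):
        up, dn = list(ell), list(ell)
        up[j] += h
        dn[j] -= h
        total += nu_j * (s1(A, p, up) - s1(A, p, dn)) / (2.0 * h)
    return total


def rhs_19a(Q, p, ell):
    """sum_i Q_1i cos(theta_0i)."""
    return sum(Q_i * math.cos(theta(p_i, ell)) for Q_i, p_i in zip(Q, p))


# Made-up illustrative data: two periodic terms, five angle variables.
Q = [0.3, -0.2]
p = [(1, 1, 1, -1, -1), (0, -1, 0, 1, 1)]
nu = [1.1, 0.0, -0.05, -0.4, 0.0]
ell = [0.2, 1.3, -0.7, 2.1, 0.4]

A = first_order_coefficients(Q, p, nu)
left = lhs_19a(A, p, nu, ell)
right = rhs_19a(Q, p, ell)
```

Small values of the denominator inflate A_{1i}, so in practice terms near resonance would need separate treatment.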


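The equations of transform then give

\[ L_{j0} = \frac{\partial S}{\partial \ell_{j0}} = L_{j1} + \sum_i A_{1i}\, p_{ji} \cos \theta_{0i} \]

\[ \ell_{j1} = \frac{\partial S}{\partial L_{j1}} = \ell_{j0} + \sum_i \frac{\partial A_{1i}}{\partial L_{j1}} \sin \theta_{0i} \qquad (j = 1, \ldots, 5) \tag{24} \]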
The complete first order Hamiltonian function for this part of
the problem may now be written in the new variables as

\[ F_1^* = \frac{\mu^2}{2 L_{11}^2} - \frac{\mu^2}{2 L_{41}^2} - \alpha L_{31} - .85\, \epsilon_1 \left(L_{11}^4 + L_{41}^4\right)^{1/2} \tag{28} \]

The corresponding equations of motion are then

\[ \dot L_{j1} = \partial F_1^* / \partial \ell_{j1}; \qquad \dot \ell_{j1} = -\partial F_1^* / \partial L_{j1} \qquad (j = 1, \ldots, 5) \tag{29} \]

The solutions of (29) may then be written

\[ L_{j1} = a_j; \qquad \ell_{j1} = f_j t + b_j \qquad (j = 1, \ldots, 5) \tag{30} \]

where a_j, f_j and b_j are constants.
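
Because the transformed Hamiltonian depends on the L_{j1} only, the rates f_j in the solution follow by differentiation, f_j = -\partial F_1^* / \partial L_{j1}. The sketch below is a hedged illustration: it assumes the first order form F_1^* = \mu^2 / (2 L_{11}^2) - \mu^2 / (2 L_{41}^2) - \alpha L_{31} - 0.85 \epsilon_1 (L_{11}^4 + L_{41}^4)^{1/2}, uses arbitrary numerical values, and compares the closed-form rates with a finite-difference derivative.

```python
import math


def f1_star(L, mu, alpha, eps1):
    """Assumed first order transformed Hamiltonian F_1*(L_11, ..., L_51)."""
    L11, L21, L31, L41, L51 = L
    return (mu ** 2 / (2.0 * L11 ** 2) - mu ** 2 / (2.0 * L41 ** 2)
            - alpha * L31 - 0.85 * eps1 * math.sqrt(L11 ** 4 + L41 ** 4))


def angle_rates(L, mu, alpha, eps1):
    """Closed-form f_j = -dF_1*/dL_j1 for j = 1..5."""
    L11, _, _, L41, _ = L
    root = math.sqrt(L11 ** 4 + L41 ** 4)
    return [
        mu ** 2 / L11 ** 3 + 1.7 * eps1 * L11 ** 3 / root,
        0.0,
        alpha,
        -mu ** 2 / L41 ** 3 + 1.7 * eps1 * L41 ** 3 / root,
        0.0,
    ]


def angle_rates_numeric(L, mu, alpha, eps1, h=1e-6):
    """Finite-difference estimate of the same rates."""
    rates = []
    for j in range(len(L)):
        up, dn = list(L), list(L)
        up[j] += h
        dn[j] -= h
        slope = (f1_star(up, mu, alpha, eps1) - f1_star(dn, mu, alpha, eps1)) / (2.0 * h)
        rates.append(-slope)
    return rates


# Arbitrary illustrative values.
L_NEW = [1.2, 0.9, 0.5, 1.5, 1.1]
MU, ALPHA, EPS1 = 1.0, -0.01, 0.05
closed_form = angle_rates(L_NEW, MU, ALPHA, EPS1)
numeric = angle_rates_numeric(L_NEW, MU, ALPHA, EPS1)
```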
Substituting the values of $\ell_{j0}$ from these equations into the expression
for $\theta_{0i}$, equation (9a), gives

$$\theta_{0i} = \sum_{j} p_{ji}\left[\ell_{j1} - \sum_{i=1}^{n}\frac{\partial A_{1i}}{\partial L_{j1}}\sin\theta_{0i}\right]$$

Upon defining

$$\theta_{1i} = \sum_{j} p_{ji}\ell_{j1} \tag{25}$$

$\theta_{0i}$ becomes

$$\theta_{0i} = \theta_{1i} - \sum_{j} p_{ji}\sum_{k=1}^{n}\frac{\partial A_{1k}}{\partial L_{j1}}\sin\theta_{0k} \tag{26}$$

where the index of the second summation has been changed to avoid
confusion. This expression may then be written

$$\theta_{0i} = \theta_{1i} - \sum_{j} p_{ji}\left[\sum_{k=1}^{i-1}\frac{\partial A_{1k}}{\partial L_{j1}}\sin\theta_{0k} + \sum_{k=i+1}^{n}\frac{\partial A_{1k}}{\partial L_{j1}}\sin\theta_{0k}\right] - \sum_{j} p_{ji}\frac{\partial A_{1i}}{\partial L_{j1}}\sin\theta_{0i}$$

which is in a form to which the Lagrange expansion theorem is applicable.
Applying this theorem and performing the necessary simplifications gives
the values for $\cos\theta_{0i}$ and $\sin\theta_{0i}$ needed in the transform equations.

$$\cos\theta_{0i} = \cos\theta_{1i} + (\text{terms of first and higher order in } \epsilon_1)$$

$$\sin\theta_{0i} = \sin\theta_{1i} + (\text{terms of first and higher order in } \epsilon_1)$$

Then, since $A_{1i}$ is itself a quantity of first order in $\epsilon_1$, the transformation
equations become

$$L_{j0} = L_{j1} + \sum_{i} A_{1i}p_{ji}\cos\theta_{1i}$$

$$\ell_{j0} = \ell_{j1} - \sum_{i} E_{ji}\sin\theta_{1i}, \quad\text{where } E_{ji} = \frac{\partial A_{1i}}{\partial L_{j1}} \qquad (j = 1, 2, \ldots, 5) \tag{27}$$

These  expressions  are  of  course  a great  deal  more  complicated  when 
higher  order  solutions  are  sought. 
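The first-order statement above can be checked numerically on a reduced model. The sketch below assumes a single angle with one sine term, $\theta_0 = \theta_1 - c\sin\theta_0$, where the hypothetical constant `c` stands in for the first-order coefficients; it is not the coupled system of the text. It solves the implicit relation by fixed-point iteration rather than by the Lagrange expansion theorem and shows that $\cos\theta_0$ differs from $\cos\theta_1$ by a quantity proportional to `c`.

```python
import math

def solve_theta0(theta1, c, tol=1e-14, max_iter=200):
    """Solve theta0 = theta1 - c*sin(theta0) by fixed-point iteration (|c| < 1)."""
    theta0 = theta1
    for _ in range(max_iter):
        new = theta1 - c * math.sin(theta0)
        if abs(new - theta0) < tol:
            return new
        theta0 = new
    raise RuntimeError("fixed-point iteration did not converge")

theta1 = 0.7
ratios = {}
for c in (1e-1, 1e-2, 1e-3):
    theta0 = solve_theta0(theta1, c)
    err = abs(math.cos(theta0) - math.cos(theta1))
    ratios[c] = err / c  # stays bounded, so the difference is first order in c
```

As `c` shrinks, `err / c` approaches $\sin^2\theta_1$, which is the leading term one obtains by expanding the implicit relation to first order.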
DEVELOPMENT OF AN APPROXIMATE CANONICAL FORM


Substitution of the solution equations 30 or the transformation
equations 27 will not yield equations in a canonical form in $F_2$
directly. Therefore, it is necessary to make further small order
approximations to obtain equations in a form suitable for further
application of the procedure. To illustrate this, and to provide a
somewhat simplified outline of the developments performed in the
first transformation, consider the equations of motion 5a in the
following form:

$$\dot L_{p0} = \frac{\partial F_1}{\partial \ell_{p0}};\qquad \dot\ell_{p0} = -\frac{\partial F_1}{\partial L_{p0}}$$

$$\dot L_{q0} = \frac{\partial F_1}{\partial \ell_{q0}} - \frac{\partial F_2}{\partial \ell_{q0}};\qquad \dot\ell_{q0} = -\frac{\partial F_1}{\partial L_{q0}} + \frac{\partial F_2}{\partial L_{q0}} \tag{31}$$

$$(p = 1, 2, 3) \qquad (q = 4, 5)$$

$$F_1 = F_1(L_{p0}, \ell_{p0}, L_{q0}, \ell_{q0})$$

$$F_2 = F_2(L_{p0}, \ell_{p0}, L_{q0}, \ell_{q0})$$


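The technique followed so far has been to obtain solutions to the
equations obtained by neglecting $F_2$.

$$\dot L_{j0} = \frac{\partial F_1}{\partial \ell_{j0}};\qquad \dot\ell_{j0} = -\frac{\partial F_1}{\partial L_{j0}} \qquad (j = 1,\ldots,5) \tag{32}$$

The solutions to these equations were found by solving the Hamilton-
Jacobi equation

$$F_1\!\left(\ell_{j0}, \frac{\partial S}{\partial \ell_{j0}}\right) = F_1^*(L_{j1}) \tag{33}$$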
where $S$ is the determining function

$$S = S(L_{j1}, \ell_{j0})$$

The equations of transformation were then obtained from

$$L_{j0} = \frac{\partial S}{\partial \ell_{j0}};\qquad \ell_{j1} = \frac{\partial S}{\partial L_{j1}}$$

which gave

$$L_{j0} = L_{j0}(L_{k1}, \ell_{k1}) = L_{j1} + \sum_{i} A_{1i}p_{ji}\cos\theta_{0i}$$

$$\ell_{j0} = \ell_{j0}(L_{k1}, \ell_{k1}) = \ell_{j1} - \sum_{i} E_{ji}\sin\theta_{0i} \tag{35}$$


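Taking the total time derivatives of these equations and substituting
into the equations of motion 31 gives

$$\sum_{r=1}^{5}\left[\frac{\partial L_{p0}}{\partial L_{r1}}\dot L_{r1} + \frac{\partial L_{p0}}{\partial \ell_{r1}}\dot\ell_{r1}\right] = \frac{\partial F_1}{\partial \ell_{p0}} \tag{36a}$$

$$\sum_{r=1}^{5}\left[\frac{\partial \ell_{p0}}{\partial L_{r1}}\dot L_{r1} + \frac{\partial \ell_{p0}}{\partial \ell_{r1}}\dot\ell_{r1}\right] = -\frac{\partial F_1}{\partial L_{p0}} \tag{36b}$$

$$\sum_{r=1}^{5}\left[\frac{\partial L_{q0}}{\partial L_{r1}}\dot L_{r1} + \frac{\partial L_{q0}}{\partial \ell_{r1}}\dot\ell_{r1}\right] = \frac{\partial F_1}{\partial \ell_{q0}} - \frac{\partial F_2}{\partial \ell_{q0}} \tag{36c}$$

$$\sum_{r=1}^{5}\left[\frac{\partial \ell_{q0}}{\partial L_{r1}}\dot L_{r1} + \frac{\partial \ell_{q0}}{\partial \ell_{r1}}\dot\ell_{r1}\right] = -\frac{\partial F_1}{\partial L_{q0}} + \frac{\partial F_2}{\partial L_{q0}} \tag{36d}$$

$$(p = 1, 2, 3) \qquad (q = 4, 5)$$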
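Multiplying each of equations 36a by $\partial\ell_{p0}/\partial L_{k1}$ and each of equations 36c
by $\partial\ell_{q0}/\partial L_{k1}$, where $L_{k1}$ is one particular $L_{j1}$, and adding gives

$$\sum_{r=1}^{5}\left[\sum_{p=1}^{3}\frac{\partial L_{p0}}{\partial L_{r1}}\frac{\partial \ell_{p0}}{\partial L_{k1}} + \sum_{q=4}^{5}\frac{\partial L_{q0}}{\partial L_{r1}}\frac{\partial \ell_{q0}}{\partial L_{k1}}\right]\dot L_{r1} + \sum_{r=1}^{5}\left[\sum_{p=1}^{3}\frac{\partial L_{p0}}{\partial \ell_{r1}}\frac{\partial \ell_{p0}}{\partial L_{k1}} + \sum_{q=4}^{5}\frac{\partial L_{q0}}{\partial \ell_{r1}}\frac{\partial \ell_{q0}}{\partial L_{k1}}\right]\dot\ell_{r1}$$

$$= \sum_{p=1}^{3}\frac{\partial F_1}{\partial \ell_{p0}}\frac{\partial \ell_{p0}}{\partial L_{k1}} + \sum_{q=4}^{5}\left[\frac{\partial F_1}{\partial \ell_{q0}} - \frac{\partial F_2}{\partial \ell_{q0}}\right]\frac{\partial \ell_{q0}}{\partial L_{k1}} \tag{37}$$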


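Multiplying each equation 36b by $\partial L_{p0}/\partial L_{k1}$ and each of equations 36d
by $\partial L_{q0}/\partial L_{k1}$ and adding gives

$$\sum_{r=1}^{5}\left[\sum_{p=1}^{3}\frac{\partial \ell_{p0}}{\partial L_{r1}}\frac{\partial L_{p0}}{\partial L_{k1}} + \sum_{q=4}^{5}\frac{\partial \ell_{q0}}{\partial L_{r1}}\frac{\partial L_{q0}}{\partial L_{k1}}\right]\dot L_{r1} + \sum_{r=1}^{5}\left[\sum_{p=1}^{3}\frac{\partial \ell_{p0}}{\partial \ell_{r1}}\frac{\partial L_{p0}}{\partial L_{k1}} + \sum_{q=4}^{5}\frac{\partial \ell_{q0}}{\partial \ell_{r1}}\frac{\partial L_{q0}}{\partial L_{k1}}\right]\dot\ell_{r1}$$

$$= -\sum_{p=1}^{3}\frac{\partial F_1}{\partial L_{p0}}\frac{\partial L_{p0}}{\partial L_{k1}} - \sum_{q=4}^{5}\frac{\partial F_1}{\partial L_{q0}}\frac{\partial L_{q0}}{\partial L_{k1}} + \sum_{q=4}^{5}\frac{\partial F_2}{\partial L_{q0}}\frac{\partial L_{q0}}{\partial L_{k1}} \tag{38}$$

Subtracting equation 37 from equation 38 and writing the coefficients
as the brackets

$$[u, v] = \sum_{j=1}^{5}\left[\frac{\partial L_{j0}}{\partial u}\frac{\partial \ell_{j0}}{\partial v} - \frac{\partial \ell_{j0}}{\partial u}\frac{\partial L_{j0}}{\partial v}\right]$$

then gives

$$\sum_{r=1}^{5}[L_{k1}, L_{r1}]\dot L_{r1} + \sum_{r=1}^{5}[L_{k1}, \ell_{r1}]\dot\ell_{r1} = -\sum_{j=1}^{5}\left[\frac{\partial F_1}{\partial L_{j0}}\frac{\partial L_{j0}}{\partial L_{k1}} + \frac{\partial F_1}{\partial \ell_{j0}}\frac{\partial \ell_{j0}}{\partial L_{k1}}\right] + \sum_{q=4}^{5}\left[\frac{\partial F_2}{\partial L_{q0}}\frac{\partial L_{q0}}{\partial L_{k1}} + \frac{\partial F_2}{\partial \ell_{q0}}\frac{\partial \ell_{q0}}{\partial L_{k1}}\right] \tag{40}$$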
q-4 
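Multiplying each of equations 36a and 36c by $\partial\ell_{p0}/\partial\ell_{k1}$ and $\partial\ell_{q0}/\partial\ell_{k1}$
respectively and adding gives

$$\sum_{r=1}^{5}\left[\sum_{p=1}^{3}\frac{\partial L_{p0}}{\partial L_{r1}}\frac{\partial \ell_{p0}}{\partial \ell_{k1}} + \sum_{q=4}^{5}\frac{\partial L_{q0}}{\partial L_{r1}}\frac{\partial \ell_{q0}}{\partial \ell_{k1}}\right]\dot L_{r1} + \sum_{r=1}^{5}\left[\sum_{p=1}^{3}\frac{\partial L_{p0}}{\partial \ell_{r1}}\frac{\partial \ell_{p0}}{\partial \ell_{k1}} + \sum_{q=4}^{5}\frac{\partial L_{q0}}{\partial \ell_{r1}}\frac{\partial \ell_{q0}}{\partial \ell_{k1}}\right]\dot\ell_{r1}$$

$$= \sum_{p=1}^{3}\frac{\partial F_1}{\partial \ell_{p0}}\frac{\partial \ell_{p0}}{\partial \ell_{k1}} + \sum_{q=4}^{5}\left[\frac{\partial F_1}{\partial \ell_{q0}} - \frac{\partial F_2}{\partial \ell_{q0}}\right]\frac{\partial \ell_{q0}}{\partial \ell_{k1}} \tag{41}$$

Multiplying each of equations 36b and 36d by $\partial L_{p0}/\partial\ell_{k1}$ and $\partial L_{q0}/\partial\ell_{k1}$
respectively and adding gives

$$\sum_{r=1}^{5}\left[\sum_{p=1}^{3}\frac{\partial \ell_{p0}}{\partial L_{r1}}\frac{\partial L_{p0}}{\partial \ell_{k1}} + \sum_{q=4}^{5}\frac{\partial \ell_{q0}}{\partial L_{r1}}\frac{\partial L_{q0}}{\partial \ell_{k1}}\right]\dot L_{r1} + \sum_{r=1}^{5}\left[\sum_{p=1}^{3}\frac{\partial \ell_{p0}}{\partial \ell_{r1}}\frac{\partial L_{p0}}{\partial \ell_{k1}} + \sum_{q=4}^{5}\frac{\partial \ell_{q0}}{\partial \ell_{r1}}\frac{\partial L_{q0}}{\partial \ell_{k1}}\right]\dot\ell_{r1}$$

$$= -\sum_{p=1}^{3}\frac{\partial F_1}{\partial L_{p0}}\frac{\partial L_{p0}}{\partial \ell_{k1}} - \sum_{q=4}^{5}\frac{\partial F_1}{\partial L_{q0}}\frac{\partial L_{q0}}{\partial \ell_{k1}} + \sum_{q=4}^{5}\frac{\partial F_2}{\partial L_{q0}}\frac{\partial L_{q0}}{\partial \ell_{k1}} \tag{42}$$

Subtracting equation 41 from equation 42 then gives

$$\sum_{r=1}^{5}[\ell_{k1}, L_{r1}]\dot L_{r1} + \sum_{r=1}^{5}[\ell_{k1}, \ell_{r1}]\dot\ell_{r1} = -\sum_{p=1}^{3}\left[\frac{\partial F_1}{\partial L_{p0}}\frac{\partial L_{p0}}{\partial \ell_{k1}} + \frac{\partial F_1}{\partial \ell_{p0}}\frac{\partial \ell_{p0}}{\partial \ell_{k1}}\right] - \sum_{q=4}^{5}\left[\frac{\partial F_1}{\partial L_{q0}}\frac{\partial L_{q0}}{\partial \ell_{k1}} + \frac{\partial F_1}{\partial \ell_{q0}}\frac{\partial \ell_{q0}}{\partial \ell_{k1}}\right] + \sum_{q=4}^{5}\left[\frac{\partial F_2}{\partial L_{q0}}\frac{\partial L_{q0}}{\partial \ell_{k1}} + \frac{\partial F_2}{\partial \ell_{q0}}\frac{\partial \ell_{q0}}{\partial \ell_{k1}}\right] \tag{43}$$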


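To evaluate the brackets it is necessary to transform the determining
function by means of the transform equations (35)

$$S(L_{j1}, \ell_{j0}) = S'(L_{j1}, \ell_{j1}) \qquad (j = 1,\ldots,5) \tag{44}$$

The partial derivatives of $S'$ may then be expressed

$$\frac{\partial S'}{\partial L_{k1}} = \frac{\partial S}{\partial L_{k1}} + \sum_{j=1}^{5}\frac{\partial S}{\partial \ell_{j0}}\frac{\partial \ell_{j0}}{\partial L_{k1}} = \ell_{k1} + \sum_{j=1}^{5} L_{j0}\frac{\partial \ell_{j0}}{\partial L_{k1}}$$

$$\frac{\partial S'}{\partial \ell_{k1}} = \sum_{j=1}^{5}\frac{\partial S}{\partial \ell_{j0}}\frac{\partial \ell_{j0}}{\partial \ell_{k1}} = \sum_{j=1}^{5} L_{j0}\frac{\partial \ell_{j0}}{\partial \ell_{k1}} \tag{45}$$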
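Rewriting the brackets as

$$[L_{k1}, L_{r1}] = \frac{\partial}{\partial L_{k1}}\left[\sum_{p=1}^{3} L_{p0}\frac{\partial \ell_{p0}}{\partial L_{r1}} + \sum_{q=4}^{5} L_{q0}\frac{\partial \ell_{q0}}{\partial L_{r1}}\right] - \frac{\partial}{\partial L_{r1}}\left[\sum_{p=1}^{3} L_{p0}\frac{\partial \ell_{p0}}{\partial L_{k1}} + \sum_{q=4}^{5} L_{q0}\frac{\partial \ell_{q0}}{\partial L_{k1}}\right]$$

$$[\ell_{k1}, \ell_{r1}] = \frac{\partial}{\partial \ell_{k1}}\left[\sum_{p=1}^{3} L_{p0}\frac{\partial \ell_{p0}}{\partial \ell_{r1}} + \sum_{q=4}^{5} L_{q0}\frac{\partial \ell_{q0}}{\partial \ell_{r1}}\right] - \frac{\partial}{\partial \ell_{r1}}\left[\sum_{p=1}^{3} L_{p0}\frac{\partial \ell_{p0}}{\partial \ell_{k1}} + \sum_{q=4}^{5} L_{q0}\frac{\partial \ell_{q0}}{\partial \ell_{k1}}\right]$$

$$[L_{k1}, \ell_{r1}] = \frac{\partial}{\partial L_{k1}}\left[\sum_{p=1}^{3} L_{p0}\frac{\partial \ell_{p0}}{\partial \ell_{r1}} + \sum_{q=4}^{5} L_{q0}\frac{\partial \ell_{q0}}{\partial \ell_{r1}}\right] - \frac{\partial}{\partial \ell_{r1}}\left[\sum_{p=1}^{3} L_{p0}\frac{\partial \ell_{p0}}{\partial L_{k1}} + \sum_{q=4}^{5} L_{q0}\frac{\partial \ell_{q0}}{\partial L_{k1}}\right]$$

and substituting the derivatives

$$[L_{k1}, L_{r1}] = \frac{\partial}{\partial L_{k1}}\left[\frac{\partial S'}{\partial L_{r1}} - \ell_{r1}\right] - \frac{\partial}{\partial L_{r1}}\left[\frac{\partial S'}{\partial L_{k1}} - \ell_{k1}\right] \tag{46a}$$

$$[\ell_{k1}, \ell_{r1}] = \frac{\partial}{\partial \ell_{k1}}\frac{\partial S'}{\partial \ell_{r1}} - \frac{\partial}{\partial \ell_{r1}}\frac{\partial S'}{\partial \ell_{k1}} \tag{46b}$$

$$[L_{k1}, \ell_{r1}] = \frac{\partial}{\partial L_{k1}}\frac{\partial S'}{\partial \ell_{r1}} - \frac{\partial}{\partial \ell_{r1}}\left[\frac{\partial S'}{\partial L_{k1}} - \ell_{k1}\right] \tag{46c}$$

Then, since all $L_{j1}$ and $\ell_{j1}$ are independent variables, equation 46a
gives

$$[L_{k1}, L_{r1}] = 0$$

equation 46b gives

$$[\ell_{k1}, \ell_{r1}] = 0$$

and equation 46c yields

$$[L_{k1}, \ell_{r1}] = \frac{\partial \ell_{k1}}{\partial \ell_{r1}} = \begin{cases} 0 & r \neq k \\ 1 & r = k \end{cases} = \delta_{rk};\qquad [\ell_{k1}, L_{r1}] = -\delta_{rk} \tag{47}$$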
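The bracket results can be checked numerically on a reduced example. The sketch below assumes a single degree of freedom and a hypothetical determining function $S = L_1\ell_0 + \epsilon A(L_1)\sin\ell_0$ with $A(L) = L^2$; these choices are illustrative and are not the functions of the text. It evaluates $[L_1, \ell_1]$ by central differences and finds it equal to one, as equation 47 requires for a transformation derived from a determining function.

```python
import math

EPS = 0.05  # hypothetical small parameter

def A(L):
    """Hypothetical amplitude function A(L1)."""
    return L * L

def dA(L):
    return 2.0 * L

def old_from_new(L1, l1):
    """Return (L0, l0) for given new variables (L1, l1).

    From S = L1*l0 + EPS*A(L1)*sin(l0):
        l1 = dS/dL1 = l0 + EPS*A'(L1)*sin(l0)   (solved for l0 by Newton)
        L0 = dS/dl0 = L1 + EPS*A(L1)*cos(l0)
    """
    c = EPS * dA(L1)
    l0 = l1
    for _ in range(50):
        step = (l0 + c * math.sin(l0) - l1) / (1.0 + c * math.cos(l0))
        l0 -= step
        if abs(step) < 1e-15:
            break
    return L1 + EPS * A(L1) * math.cos(l0), l0

def lagrange_bracket(L1, l1, h=1e-5):
    """[L1, l1] = dL0/dL1 * dl0/dl1 - dl0/dL1 * dL0/dl1 by central differences."""
    Lp, lp = old_from_new(L1 + h, l1)
    Lm, lm = old_from_new(L1 - h, l1)
    dL0_dL1, dl0_dL1 = (Lp - Lm) / (2 * h), (lp - lm) / (2 * h)
    Lp, lp = old_from_new(L1, l1 + h)
    Lm, lm = old_from_new(L1, l1 - h)
    dL0_dl1, dl0_dl1 = (Lp - Lm) / (2 * h), (lp - lm) / (2 * h)
    return dL0_dL1 * dl0_dl1 - dl0_dL1 * dL0_dl1

bracket = lagrange_bracket(1.3, 0.9)
```

With one degree of freedom the only independent bracket is $[L_1, \ell_1]$; the brackets of a variable with itself vanish identically.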

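Substitution of equations 47 into the equations 40 and 43 gives

$$\dot\ell_{r1} = -\sum_{p=1}^{3}\left[\frac{\partial F_1}{\partial L_{p0}}\frac{\partial L_{p0}}{\partial L_{r1}} + \frac{\partial F_1}{\partial \ell_{p0}}\frac{\partial \ell_{p0}}{\partial L_{r1}}\right] - \sum_{q=4}^{5}\left[\frac{\partial F_1}{\partial L_{q0}}\frac{\partial L_{q0}}{\partial L_{r1}} + \frac{\partial F_1}{\partial \ell_{q0}}\frac{\partial \ell_{q0}}{\partial L_{r1}}\right] + \sum_{q=4}^{5}\left[\frac{\partial F_2}{\partial L_{q0}}\frac{\partial L_{q0}}{\partial L_{r1}} + \frac{\partial F_2}{\partial \ell_{q0}}\frac{\partial \ell_{q0}}{\partial L_{r1}}\right]$$

$$\dot L_{r1} = \sum_{p=1}^{3}\left[\frac{\partial F_1}{\partial L_{p0}}\frac{\partial L_{p0}}{\partial \ell_{r1}} + \frac{\partial F_1}{\partial \ell_{p0}}\frac{\partial \ell_{p0}}{\partial \ell_{r1}}\right] + \sum_{q=4}^{5}\left[\frac{\partial F_1}{\partial L_{q0}}\frac{\partial L_{q0}}{\partial \ell_{r1}} + \frac{\partial F_1}{\partial \ell_{q0}}\frac{\partial \ell_{q0}}{\partial \ell_{r1}}\right] - \sum_{q=4}^{5}\left[\frac{\partial F_2}{\partial L_{q0}}\frac{\partial L_{q0}}{\partial \ell_{r1}} + \frac{\partial F_2}{\partial \ell_{q0}}\frac{\partial \ell_{q0}}{\partial \ell_{r1}}\right] \tag{48}$$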
Now, the transformation of the previous section transformed

$$F_1(L_{j0}, \ell_{j0}) = F_1^*(L_{j1}) \tag{49}$$

This of course is a special case of the more general transform

$$F_1(L_{j0}, \ell_{j0}) = F_1^*(L_{j1}, \ell_{j1})$$

Derivatives of $F_1^*$ may then be obtained as

$$\frac{\partial F_1^*}{\partial L_{r1}} = \sum_{j=1}^{5}\left[\frac{\partial F_1}{\partial L_{j0}}\frac{\partial L_{j0}}{\partial L_{r1}} + \frac{\partial F_1}{\partial \ell_{j0}}\frac{\partial \ell_{j0}}{\partial L_{r1}}\right] \tag{50}$$

and

$$\frac{\partial F_1^*}{\partial \ell_{r1}} = \sum_{j=1}^{5}\left[\frac{\partial F_1}{\partial L_{j0}}\frac{\partial L_{j0}}{\partial \ell_{r1}} + \frac{\partial F_1}{\partial \ell_{j0}}\frac{\partial \ell_{j0}}{\partial \ell_{r1}}\right] \tag{51}$$

where it is recognized that $\partial F_1^*/\partial \ell_{j1} = 0$ for the special case of
equation 49, but the form of equation 51 is used here to maintain
symmetry.

Then combining the summations over p = 1 to 3 and q = 4 to 5
into a single summation over j = 1 to 5 in equation 48 and substituting
equations 50 and 51, the following form is obtained.
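$$\dot\ell_{r1} = -\frac{\partial F_1^*}{\partial L_{r1}} + \sum_{q=4}^{5}\left[\frac{\partial F_2}{\partial L_{q0}}\frac{\partial L_{q0}}{\partial L_{r1}} + \frac{\partial F_2}{\partial \ell_{q0}}\frac{\partial \ell_{q0}}{\partial L_{r1}}\right]$$

$$\dot L_{r1} = +\frac{\partial F_1^*}{\partial \ell_{r1}} - \sum_{q=4}^{5}\left[\frac{\partial F_2}{\partial L_{q0}}\frac{\partial L_{q0}}{\partial \ell_{r1}} + \frac{\partial F_2}{\partial \ell_{q0}}\frac{\partial \ell_{q0}}{\partial \ell_{r1}}\right] \tag{52}$$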
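The same transformation must now be applied to $F_2$. This yields

$$F_2(L_{j0}, \ell_{j0}) = F_2'(L_{j1}, \ell_{j1}) \qquad (j = 1,\ldots,5) \tag{53}$$

The necessary derivatives from equation 53 are then

$$\frac{\partial F_2'}{\partial L_{r1}} = \sum_{j=1}^{5}\left[\frac{\partial F_2}{\partial L_{j0}}\frac{\partial L_{j0}}{\partial L_{r1}} + \frac{\partial F_2}{\partial \ell_{j0}}\frac{\partial \ell_{j0}}{\partial L_{r1}}\right]$$

$$\frac{\partial F_2'}{\partial \ell_{r1}} = \sum_{j=1}^{5}\left[\frac{\partial F_2}{\partial L_{j0}}\frac{\partial L_{j0}}{\partial \ell_{r1}} + \frac{\partial F_2}{\partial \ell_{j0}}\frac{\partial \ell_{j0}}{\partial \ell_{r1}}\right] \tag{54}$$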

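Now, referring to the transform equations (35) and remembering
from the previous section that all $A_{1i}$ and $E_{ji}$ are terms of order
$\epsilon_1$, written $O(\epsilon_1)$, it is seen that the derivatives of the old
parameters in terms of the new may be expressed as

$$\frac{\partial L_{j0}}{\partial \ell_{r1}} = O(\epsilon_1);\qquad \frac{\partial L_{j0}}{\partial L_{r1}} = \delta_{jr} + O(\epsilon_1)$$

$$\frac{\partial \ell_{j0}}{\partial \ell_{r1}} = \delta_{jr} + O(\epsilon_1);\qquad \frac{\partial \ell_{j0}}{\partial L_{r1}} = O(\epsilon_1) \tag{55}$$

$$\text{where } \delta_{jr} = \begin{cases} 0 & j \neq r \\ 1 & j = r \end{cases}$$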

Further, the function $F_2$ contains a multiplier $\mu^2/L_{10}^6$ which

is  always  a small  quantity  of  order  less  than  even  though 
it is not a constant. Hence, by neglecting products of these
two small quantities, equations (54) may be expressed


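$$\frac{\partial F_2'}{\partial L_{r1}} = \frac{\partial F_2}{\partial L_{r0}};\qquad \frac{\partial F_2'}{\partial \ell_{r1}} = \frac{\partial F_2}{\partial \ell_{r0}} \qquad (r = 1,\ldots,5) \tag{56}$$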
Then, substituting the relations (55) into the equations (52) and again
neglecting products of the small quantities, the equations of motion may
be expressed

$$\dot L_{r1} = \frac{\partial F_1^*}{\partial \ell_{r1}} - \sum_{q=4}^{5}\frac{\partial F_2}{\partial \ell_{q0}}\delta_{qr}$$

$$\dot\ell_{r1} = -\frac{\partial F_1^*}{\partial L_{r1}} + \sum_{q=4}^{5}\frac{\partial F_2}{\partial L_{q0}}\delta_{qr} \qquad (r = 1,\ldots,5) \tag{57}$$


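Then, substituting equations (56) and taking advantage of the properties of
the Kronecker delta, equations (57) may be expressed in expanded form.

$$\dot L_{11} = \frac{\partial F_1^*}{\partial \ell_{11}} \qquad\qquad \dot\ell_{11} = -\frac{\partial F_1^*}{\partial L_{11}}$$

$$\dot L_{21} = \frac{\partial F_1^*}{\partial \ell_{21}} \qquad\qquad \dot\ell_{21} = -\frac{\partial F_1^*}{\partial L_{21}}$$

$$\dot L_{31} = \frac{\partial F_1^*}{\partial \ell_{31}} \qquad\qquad \dot\ell_{31} = -\frac{\partial F_1^*}{\partial L_{31}} \tag{58}$$

$$\dot L_{41} = \frac{\partial F_1^*}{\partial \ell_{41}} - \frac{\partial F_2'}{\partial \ell_{41}} \qquad \dot\ell_{41} = -\frac{\partial F_1^*}{\partial L_{41}} + \frac{\partial F_2'}{\partial L_{41}}$$

$$\dot L_{51} = \frac{\partial F_1^*}{\partial \ell_{51}} - \frac{\partial F_2'}{\partial \ell_{51}} \qquad \dot\ell_{51} = -\frac{\partial F_1^*}{\partial L_{51}} + \frac{\partial F_2'}{\partial L_{51}}$$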

Equations (58) are the new equations of motion to be solved. Now note
that $F_1^*$ of equation 28 contains none of the $\ell_{j1}$ terms. Hence, the first
three equations in the left hand column of equations (58) become:

$$\dot L_{11} = 0;\qquad \dot L_{21} = 0;\qquad \dot L_{31} = 0 \tag{59}$$

from whence

$$L_{11} = a_1 = \text{const.};\qquad L_{21} = a_2 = \text{const.};\qquad L_{31} = a_3 = \text{const.} \tag{60}$$
Similarly, the second and third equations in the right hand column
of equations (58) become

$$\dot\ell_{21} = 0;\qquad \dot\ell_{31} = \alpha \tag{61}$$

where $\alpha$ is a previously defined constant. Hence,

$$\ell_{21} = c_2 = \text{const.};\qquad \ell_{31} = \alpha t + c_3 \tag{62}$$


The  first  equation  in  the  right  hand  column  of  equations  (58)  becomes 


Jhi  - 

Hi 


1.7Ln 


1 +(  1^41 


/ L'll  ) 


n 


(63) 


To continue further with this approach, it is necessary that equation
63 take the form

$$\dot\ell_{11} = \beta \tag{64}$$

where $\beta$ is a constant. The appearance of $L_{41}$ in the second term of
equation 63 thus produces considerable difficulty. Since $L_{41}$ is related
to the Lagrange multipliers, it will in general be unknown. However,
it might be noted that if $L_{41} \gg L_{11}$, the second term will be much
smaller than the first and as such may be neglected. On the other hand,
when $L_{41} \ll L_{11}$ the second term will be of $O(\epsilon_1)$ as compared with the
first term of $O(1)$. Neglecting the second term under these conditions
is hardly justified and it will be necessary to assume some constant
value for $L_{41}$ in equation 63.
The procedure from this point on must then be considered
iterative in an actual calculation. A good first choice for $\beta$ will
probably be

$$\beta = \mu^2/a_1^3 \tag{65}$$

where $a_1$ is the constant of equation (60). The problem must then
be solved and the resultant range of $L_{41}$ and the corresponding range
of the neglected term in equation (63) examined. If this neglected
term does not remain small compared to $\mu^2/a_1^3$ a new $\beta$ must be
chosen

P = p2  /af  - €1  1.7L-n  ^1+(L4i /Lu  )^j  1 ^2  (66) 

where $\bar L_{41}$ is some averaged constant value from the range of $L_{41}$
previously calculated. The procedure must then be repeated until
the variation in $L_{41}$ is negligible.
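The iteration described above can be sketched as a simple loop. Everything named here is hypothetical: `solve_problem` stands in for a full solution of the trajectory problem that returns an averaged $L_{41}$, and `beta_from_L41` stands in for the update rule of equation (66), written as far as that equation can be read (a correction to $\mu^2/a_1^3$ that depends on the averaged $L_{41}$). The toy functions exist only to make the loop runnable, and the sketch stops when $\beta$ itself no longer changes, which is a convenient proxy for the criterion in the text.

```python
def iterate_beta(beta0, solve_problem, beta_from_L41, tol=1e-12, max_iter=50):
    """Repeat: solve with the current beta, average L41, update beta."""
    beta = beta0
    for n in range(1, max_iter + 1):
        L41_avg = solve_problem(beta)       # averaged L41 over the solution
        new_beta = beta_from_L41(L41_avg)   # update in the spirit of equation (66)
        if abs(new_beta - beta) <= tol * abs(beta):
            return new_beta, n
        beta = new_beta
    raise RuntimeError("beta iteration did not converge")

# Hypothetical stand-ins, chosen only so the loop can run.
MU, A1, EPS1 = 1.0, 1.0, 0.01

def solve_problem(beta):
    return 0.5 + 0.1 * beta                 # toy dependence of averaged L41 on beta

def beta_from_L41(L41_avg):
    correction = EPS1 * 1.7 * A1 * (1.0 + (L41_avg / A1) ** 4) ** -0.5
    return MU ** 2 / A1 ** 3 - correction

beta, n_iter = iterate_beta(MU ** 2 / A1 ** 3, solve_problem, beta_from_L41)
```

Because the correction is of first order in $\epsilon_1$, each pass changes $\beta$ only slightly and the loop settles in a few iterations for this toy case; in a real calculation each call to `solve_problem` would be the expensive step.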

Returning now to the remainder of the problem, with $\dot\ell_{11}$
expressed as equation 64, $\ell_{11}$ becomes

$$\ell_{11} = \beta t + c_1 \tag{67}$$

The relations for $L_{11}$, $L_{21}$, $L_{31}$, $\ell_{11}$, $\ell_{21}$, $\ell_{31}$ from equations 60, 62
and 67 may then be substituted into $F_1^*$ and $F_2'$ and a new function
defined as

$$H'\left(L_{41}, L_{51}, \ell_{41}, \ell_{51}, (a_1, a_2, a_3, \alpha, \beta, c_1, c_2, c_3), t\right) = \left(F_1^* - F_2'\right)_{\text{after substitution}} \tag{68}$$
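and the equations of the problem become

$$\dot L_{41} = \frac{\partial H'}{\partial \ell_{41}} \qquad \dot\ell_{41} = -\frac{\partial H'}{\partial L_{41}}$$

$$\dot L_{51} = \frac{\partial H'}{\partial \ell_{51}} \qquad \dot\ell_{51} = -\frac{\partial H'}{\partial L_{51}} \tag{69}$$

However, to apply the same procedure as used in the first
transform, it is necessary to remove the explicit appearance of time
from the Hamiltonian of the problem. To do this, it is necessary
to introduce accessory variables and define a new Hamiltonian as

$$H = H' - \beta L_{61} - \alpha L_{71} \tag{70}$$

The equations to be solved are then

$$\dot L_{41} = \frac{\partial H}{\partial \ell_{41}} \qquad \dot\ell_{41} = -\frac{\partial H}{\partial L_{41}}$$

$$\dot L_{51} = \frac{\partial H}{\partial \ell_{51}} \qquad \dot\ell_{51} = -\frac{\partial H}{\partial L_{51}}$$

$$\dot L_{61} = \frac{\partial H}{\partial \ell_{61}} \qquad \dot\ell_{61} = -\frac{\partial H}{\partial L_{61}} \tag{71}$$

$$\dot L_{71} = \frac{\partial H}{\partial \ell_{71}} \qquad \dot\ell_{71} = -\frac{\partial H}{\partial L_{71}}$$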
THE SECOND TRANSFORM


The part of the forcing function that was neglected in the first
transform may be expressed in the form

$$F_2 = -\frac{\mu^2}{L_{10}^6}\left[T_0(L_{j0}) + \sum_{r=1}^{n} R_{0r}(L_{j0})\cos\psi_{0r}\right] \qquad (j = 1,\ldots,5) \tag{72}$$

where

$$\psi_{0r} = \sum_{j=1}^{5} w_{jr}\ell_{j0}$$

Substitution of the first transform relations of equations (27) yields the
terms in $F_2$ in the following form.

$$T_0(L_{j0}) = T_1(L_{j1}) + \sum_{u} U_{1u}(L_{j1})\cos\theta_{1u}$$

$$R_{0r}(L_{j0}) = R_{1r}(L_{j1}) + \sum_{v} V_{1v}(L_{j1})\cos\theta_{1v}$$

$$L_{10}^{-6} = L_{11}^{-6}\left(1 - 6\sum_{i} p_{1i}\frac{A_{1i}}{L_{11}}\cos\theta_{1i}\right)$$

$$\cos\psi_{0r} = \cos\psi_{1r} + \frac{1}{2}\sum_{i}\sum_{j} w_{jr}E_{ji}\left[\cos(\theta_{1i} - \psi_{1r}) - \cos(\theta_{1i} + \psi_{1r})\right]$$

where $T_1$ and $R_{1r}$ are identical expressions to $T_0$ and $R_{0r}$ with the
$L_{j0}$ replaced by $L_{j1}$, and $\psi_{1r}$ is identical to $\psi_{0r}$ with the $\ell_{j0}$ replaced
by $\ell_{j1}$. $U_{1u}$ and $V_{1v}$, like $A_{1i}$, are terms of first order in $\epsilon_1$.
Rearranging and reordering the cosine terms, $F_2'$ may be expressed
as

$$F_2' = -\frac{\mu^2}{L_{11}^6}\left[T_1 + \sum_{h=1}^{m} B_{1h}(L_{j1})\cos\phi_{1h}\right] \tag{73}$$

where

$$\phi_{1h} = \sum_{j=1}^{5} q_{jh}\ell_{j1}$$

Substituting the solutions of equations 60, 62, and 67, incorporating the
auxiliary variables and denoting $F_2'$, $T_1$ and $B_{1h}$ after the substitution
as $F_2^*$, $T_1^*$, $B_{1k}^*$, and $\mu^2/a_1^6$ as $\epsilon_2$, the expression becomes

$$F_2^* = -\epsilon_2\left[T_1^*(L_{41}, L_{51}, a_1, a_2) + \sum_{k=1}^{m} B_{1k}^*(L_{41}, L_{51}, a_1, a_2)\cos\phi_{1k}\right] \tag{74}$$

where

$$\phi_{1k} = \sum_{l=4}^{7} q_{lk}\ell_{l1} + q_{2k}c_2$$

Performing the substitutions in $F_1^*$ and adding the auxiliary terms,
the Hamiltonian for the problem becomes

$$H = \frac{\mu^2}{2a_1^2} - \alpha a_3 - .85\,\epsilon_1\left(a_1^4 + L_{41}^4\right)^{1/2} - \beta L_{61} - \alpha L_{71} + \epsilon_2\left[T_1^* + \sum_{k=1}^{m} B_{1k}^*\cos\phi_{1k}\right] \tag{75}$$

The equations are given as equations (71) which may be expressed

$$\dot L_{j1} = \frac{\partial H}{\partial \ell_{j1}};\qquad \dot\ell_{j1} = -\frac{\partial H}{\partial L_{j1}} \qquad (j = 4,\ldots,7) \tag{76}$$

The second transform now follows the same procedure as the first
transform. $\epsilon_2$ is the small parameter and $H$ may be split into two parts,
$H_0$, void of $\epsilon_2$, and $H_1$, all terms of which contain $\epsilon_2$.
$$H_0 = -.85\,\epsilon_1\left(a_1^4 + L_{41}^4\right)^{1/2} - \beta L_{61} - \alpha L_{71} \tag{77a}$$

where the terms $\mu^2/2a_1^2 - \alpha a_3$ have been neglected since they don't affect
the solution.

$$H_1 = X_1 + \sum_{k=1}^{m} Y_{1k}\cos\phi_{1k} \tag{77b}$$

where

$$X_1 = \epsilon_2 T_1^* \quad\text{and}\quad Y_{1k} = \epsilon_2 B_{1k}^*$$


The object now is to transform the function $H(L_{j1}, \ell_{j1})$ into a function of
the $L_{j2}$ only

$$H(L_{j1}, \ell_{j1}) = H^*(L_{j2}) \qquad (j = 4,\ldots,7) \tag{78}$$

As in the first transform, a determining function will again be used
in the form

$$S = S(L_{j2}, \ell_{j1})$$

which gives the transform relations

$$L_{j1} = \frac{\partial S}{\partial \ell_{j1}};\qquad \ell_{j2} = \frac{\partial S}{\partial L_{j2}} \qquad (j = 4,\ldots,7) \tag{79}$$

The parts of the Hamiltonian in equations (77) then appear in functional
form upon substitution of equations (79) as

$$H_0 = H_0\!\left(\frac{\partial S}{\partial \ell_{j1}}\right);\qquad H_1 = H_1\!\left(\frac{\partial S}{\partial \ell_{j1}}, \ell_{j1}\right) \qquad (j = 4,\ldots,7) \tag{80}$$

The determining function may then be expanded in powers of the small
parameter $\epsilon_2$.

$$S = S_0 + S_1 + S_2 + \cdots \tag{81}$$
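where again, to insure the identity transformation when $\epsilon_2$ is zero,
$S_0$ is taken in the form

$$S_0 = \sum_{j=4}^{7} L_{j2}\ell_{j1} \tag{82}$$

Expanding the new Hamiltonian $H^*(L_{j2})$ in powers of the small
parameter $\epsilon_2$ and substituting equations (82) and (80) into (78), the
Hamilton-Jacobi equation becomes

$$H_0\!\left(L_{j2} + \frac{\partial S_1}{\partial \ell_{j1}} + \frac{\partial S_2}{\partial \ell_{j1}} + \cdots\right) + H_1\!\left(L_{j2} + \frac{\partial S_1}{\partial \ell_{j1}} + \frac{\partial S_2}{\partial \ell_{j1}} + \cdots,\ \phi_{1k}\right) = H_0^* + H_1^* + H_2^* + \cdots \tag{83}$$

$H_0$ and $H_1$ may then be expanded in Taylor's series which to the first
order in $\epsilon_2$ become

$$H_0 = H_0(L_{j2}) + \sum_{j=4}^{7}\left.\frac{\partial H_0}{\partial L_{j1}}\right|_{L_{j2}}\frac{\partial S_1}{\partial \ell_{j1}}$$

$$H_1 = H_1(L_{j2}, \phi_{1k}) \tag{84}$$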

Substituting equations (84) into (83) and equating terms of like order in
$\epsilon_2$ yields

$$H_0(L_{j2}) = H_0^* \tag{85a}$$

$$\sum_{j=4}^{7}\left.\frac{\partial H_0}{\partial L_{j1}}\right|_{L_{j2}}\frac{\partial S_1}{\partial \ell_{j1}} + X_2(L_{j2}) + \sum_{k} Y_{2k}(L_{j2})\cos\phi_{1k} = H_1^*(L_{j2}) \tag{85b}$$

where $X_2$ and $Y_{2k}$ are identical expressions to $X_1$ and $Y_{1k}$ with
$L_{j1}$ replaced by $L_{j2}$.


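The necessary derivatives are evaluated from equations (77) as

$$\left.\frac{\partial H_0}{\partial L_{41}}\right|_{L_{j2}} = -1.7\,\epsilon_1 L_{42}^3\left(a_1^4 + L_{42}^4\right)^{-1/2} = -n_{42}$$

$$\left.\frac{\partial H_0}{\partial L_{51}}\right|_{L_{j2}} = 0 = -n_{52} \tag{86}$$

$$\left.\frac{\partial H_0}{\partial L_{61}}\right|_{L_{j2}} = -\beta = -n_{62}$$

$$\left.\frac{\partial H_0}{\partial L_{71}}\right|_{L_{j2}} = -\alpha = -n_{72}$$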

Using the notation of $n_{j2}$ as given in equation (86), the derivatives in
summation form may be expressed

$$\sum_{j=4}^{7}\left.\frac{\partial H_0}{\partial L_{j1}}\right|_{L_{j2}}\frac{\partial S_1}{\partial \ell_{j1}} = -\sum_{j=4}^{7} n_{j2}\frac{\partial S_1}{\partial \ell_{j1}} \tag{87}$$

Referring now to equation (85b), it is seen that the second term $X_2$ and
the right hand side, $H_1^*$, are both functions of the $L_{j2}$ only while the
other two terms are functions of both the $L_{j2}$ and the $\ell_{j1}$. Therefore,
the $H_1^*$ is only related to $X_2$.

$$H_1^*(L_{j2}) = X_2(L_{j2}) \tag{88}$$
Substitution of equations (87) and (88) in (85b) then yields

$$\sum_{j=4}^{7} n_{j2}\frac{\partial S_1}{\partial \ell_{j1}} = \sum_{k} Y_{2k}(L_{j2})\cos\phi_{1k} \tag{89}$$

A solution of equation (89) may be taken as

$$S_1 = \sum_{k} C_{2k}(L_{j2})\sin\phi_{1k} \tag{90}$$

Substitution of equation (90) into (89) then gives

$$\sum_{k}\sum_{j=4}^{7} C_{2k}q_{jk}n_{j2}\cos\phi_{1k} = \sum_{k} Y_{2k}\cos\phi_{1k} \tag{91}$$

from whence

$$C_{2k} = \frac{Y_{2k}}{\sum_{j=4}^{7} q_{jk}n_{j2}} \tag{92}$$

Equations (81), (82), and (90) then give the first order determining
function

$$S = \sum_{j=4}^{7} L_{j2}\ell_{j1} + \sum_{k} C_{2k}\sin\phi_{1k} \tag{93}$$

Substitution of equation (93) into the transform relations (79) then gives
the transform equations

$$L_{j1} = L_{j2} + \sum_{k} C_{2k}q_{jk}\cos\phi_{1k}$$

$$\ell_{j2} = \ell_{j1} + \sum_{k} D_{jk}\sin\phi_{1k} \qquad (j = 4,\ldots,7) \tag{94}$$

where

$$D_{jk} = \frac{\partial C_{2k}}{\partial L_{j2}}$$

Substitution into the expression for $\phi_{1k}$ and performing Lagrange
expansions to the first order in $\epsilon_2$ (as was done in the first transform),
the transform equations become

$$L_{j1} = L_{j2} + \sum_{k} C_{2k}q_{jk}\cos\phi_{2k}$$

$$\ell_{j1} = \ell_{j2} - \sum_{k} D_{jk}\sin\phi_{2k} \qquad (j = 4,\ldots,7) \tag{95}$$

where

$$\phi_{2k} = \sum_{l=4}^{7} q_{lk}\ell_{l2} + q_{2k}c_2$$

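The complete first order Hamiltonian in the new variables may
then be obtained from equations (77a), (85a) and (88) as

$$H^*(L_{j2}) = -.85\,\epsilon_1\left(a_1^4 + L_{42}^4\right)^{1/2} - \beta L_{62} - \alpha L_{72} + \epsilon_2 T_2 \tag{97}$$

where the expression $X_2 = \epsilon_2 T_2$ has been incorporated.
The new equations of motion to be solved are

$$\dot L_{j2} = \frac{\partial H^*}{\partial \ell_{j2}} = 0;\qquad \dot\ell_{j2} = -\frac{\partial H^*}{\partial L_{j2}} \qquad (j = 4,\ldots,7) \tag{98}$$

These equations have solutions of the form

$$L_{j2} = a_j \qquad (j = 4,\ldots,7)$$

$$\ell_{j2} = b_j t + c_j \qquad (j = 4,\ldots,7) \tag{99}$$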
where $a_j$ and $c_j$ are constants, as are the $b_j$, which are functions

of  the  a^  and  previously  defined  constants.  T2  is  obtained  from  the  ex- 
pansion of  F i in  a previous  section  by  incorporating  the  results  of  the 
first  transform  and  then  replacing  L41  and  L5  i by  L42  and  L5  2 . 

Tz  = ~ L424  + ai6  L42  (a1+a2)Zal  2 (L42+L52)2  (100) 

The  expressions  for  b.  may  then  be  obtained  from  equations  (97),  (98) 
and  (100)  as  : 


b4 


p2  1 . 7e  x a43 

a?"  (ai+atW2 


b5 

b6 

b7 


~TT  ^ 1 + af"^2  ^a4+a5  ^ (Za4  + a5  ) 

+ «z  a42  (1  + -J^)2  (a4+a5  ) 


a 


(101) 
THE SOLUTION FORM


The preceding sections have explained the transformations
necessary to integrate the differential equations (5) or (5a). To be
useful, it is necessary to express the original variables $(L_{j0}, \ell_{j0})$ in
terms of the constants of integration.


The two transform relations were obtained as equations (27)
and (95) and are repeated here for compactness of the following
discussion.

$$L_{j0} = L_{j1} + \sum_{i} A_{1i}(L_{j1})\,p_{ji}\cos\theta_{1i}$$

$$\ell_{j0} = \ell_{j1} - \sum_{i} E_{ji}(L_{j1})\sin\theta_{1i} \qquad (j = 1,\ldots,5) \tag{27}$$

and

$$L_{j1} = L_{j2} + \sum_{k} C_{2k}(L_{j2})\,q_{jk}\cos\phi_{2k}$$

$$\ell_{j1} = \ell_{j2} - \sum_{k} D_{jk}(L_{j2})\sin\phi_{2k} \qquad (j = 4,\ldots,7) \tag{95}$$


Also, in the process of determining a canonical form for the second
transform, the following relations were generated

$$L_{j1} = a_j \qquad (j = 1,\ldots,3) \tag{102}$$

$$\ell_{11} = \beta t + c_1;\qquad \ell_{21} = c_2;\qquad \ell_{31} = \alpha t + c_3 \tag{103}$$
(103) 
Now, denoting $a_j = L_{j2}$ $(j = 1,\ldots,3)$ to aid in notation, the
equations

$$L_{j1} = L_{j2} \qquad (j = 1,\ldots,3) \tag{104}$$

may be considered as part of the second transform equations.

Next, each $A_{1i}(L_{j1})$ may be expanded in a Taylor's series
about the point $L_{j1} = L_{j2}$ to the first order

$$A_{1i}(L_{j1}) = A_{2i}(L_{j2}) + \sum_{j=1}^{5}\left.\frac{\partial A_{1i}}{\partial L_{j1}}\right|_{L_{j2}}(L_{j1} - L_{j2}) + \cdots \tag{105}$$


The terms $(L_{j1} - L_{j2})$ are determined from equations (104) and (95) as

$$(L_{j1} - L_{j2}) = 0 \qquad (j = 1,\ldots,3)$$

$$(L_{j1} - L_{j2}) = \sum_{k} C_{2k}q_{jk}\cos\phi_{2k} \qquad (j = 4, 5) \tag{106}$$

Hence

$$A_{1i} = A_{2i}(L_{j2}) + \sum_{h=4}^{5}\sum_{k}\frac{\partial A_{2i}}{\partial L_{h2}}C_{2k}q_{hk}\cos\phi_{2k} \tag{107}$$

where $A_{2i}$ is an identical expression to $A_{1i}$ with the $L_{j1}$ replaced by $L_{j2}$.

Equations (107) and the first of (95) may then be substituted into the
first of equations (27) to yield

$$L_{j0} = L_{j2} + \sum_{k} C_{2k}q_{jk}\delta_j\cos\phi_{2k} + \sum_{i} A_{2i}p_{ji}\cos\theta_{1i} + \sum_{i}\left\{\sum_{h=4}^{5}\sum_{k}\frac{\partial A_{2i}}{\partial L_{h2}}C_{2k}q_{hk}\cos\phi_{2k}\right\}p_{ji}\cos\theta_{1i}$$

Now, noting that $A_{2i}$ is of order $\epsilon_1$ and $C_{2k}$ is of order $\epsilon_2$ and
neglecting terms of order $\epsilon_1\epsilon_2$, this becomes

$$L_{j0} = L_{j2} + \sum_{k} C_{2k}q_{jk}\delta_j\cos\phi_{2k} + \sum_{i} A_{2i}p_{ji}\cos\theta_{1i} \tag{108}$$

where

$$\delta_j = \begin{cases} 0 & j = 1, 2, 3 \\ 1 & j = 4, 5 \end{cases}$$

Next, from equations (99) and (101) it is seen that $\ell_{62} = \ell_{61}$ and
$\ell_{72} = \ell_{71}$. Also, in the formulation of the canonical form it was
specified that

$$\ell_{61} = \beta t + c_1 = \ell_{11}$$

and

$$\ell_{71} = \alpha t + c_3 = \ell_{31}$$

and in addition, $q_{6k} = q_{1k}$ and $q_{7k} = q_{3k}$. Consequently, it may be
specified that

$$\ell_{11} = \ell_{12};\qquad \ell_{21} = \ell_{22};\qquad \ell_{31} = \ell_{32} \tag{108}$$

are transform relations replacing the $\ell_{61}$, $\ell_{71}$ relations of equations
(95). With this notation, $\phi_{2k}$ becomes

$$\phi_{2k} = \sum_{l=1}^{5} q_{lk}\ell_{l2} \tag{109}$$
Now, returning to equation (108), $\theta_{1i}$ may be expressed

$$\theta_{1i} = \sum_{j=1}^{5} p_{ji}\ell_{j1}$$

Substituting for the $\ell_{j1}$ from equations (109) and (95) gives

$$\theta_{1i} = \sum_{j=1}^{5} p_{ji}\ell_{j2} - \sum_{j=4}^{5} p_{ji}\sum_{k} D_{jk}\sin\phi_{2k}$$

Defining

$$\theta_{2i} = \sum_{j=1}^{5} p_{ji}\ell_{j2}$$

$$\theta_{1i} = \theta_{2i} - \sum_{j=4}^{5}\sum_{k} D_{jk}p_{ji}\sin\phi_{2k} \tag{110}$$

and

$$\cos\theta_{1i} = \cos\theta_{2i} + \frac{1}{2}\sum_{j=4}^{5}\sum_{k} D_{jk}p_{ji}\left[\cos(\phi_{2k} - \theta_{2i}) - \cos(\phi_{2k} + \theta_{2i})\right] + (\text{terms of higher order in } \epsilon_2) \tag{111}$$

Substitution of equation (111) in equation (108) then gives

$$L_{j0} = L_{j2} + \sum_{i} A_{2i}p_{ji}\cos\theta_{2i} + \sum_{k} C_{2k}q_{jk}\delta_j\cos\phi_{2k} + \frac{1}{2}\sum_{i}\sum_{h=4}^{5}\sum_{k} A_{2i}p_{ji}D_{hk}p_{hi}\left\{\cos(\phi_{2k} - \theta_{2i}) - \cos(\phi_{2k} + \theta_{2i})\right\}$$

or again neglecting terms of order $\epsilon_1\epsilon_2$,

$$L_{j0} = L_{j2} + \sum_{i} A_{2i}p_{ji}\cos\theta_{2i} + \sum_{k} C_{2k}q_{jk}\delta_j\cos\phi_{2k} \tag{112}$$

Now returning to equation (27), the $E_{ji}(L_{j1})$ may be expanded in
Taylor's series in the same manner as the $A_{1i}$.
$$E_{ji}(L_{j1}) = E_{ji}(L_{j2}) + \sum_{h=4}^{5}\sum_{k}\frac{\partial E_{ji}}{\partial L_{h2}}C_{2k}q_{hk}\cos\phi_{2k} \tag{113}$$

Substituting equation (113) along with equations (108) and the second
of equations (95) into the second of equations (27) then yields

$$\ell_{j0} = \ell_{j2} - \sum_{k} D_{jk}\delta_j\sin\phi_{2k} - \sum_{i} E_{ji}(L_{j2})\sin\theta_{1i} - \sum_{i}\sum_{h=4}^{5}\sum_{k}\frac{\partial E_{ji}(L_{j2})}{\partial L_{h2}}C_{2k}q_{hk}\cos\phi_{2k}\sin\theta_{1i} \tag{114}$$


From equation (110), neglecting terms of order $\epsilon_2^2$ and higher,

$$\sin\theta_{1i} = \sin\theta_{2i} - \frac{1}{2}\sum_{j=4}^{5}\sum_{k} D_{jk}p_{ji}\left[\sin(\phi_{2k} + \theta_{2i}) + \sin(\phi_{2k} - \theta_{2i})\right] \tag{115}$$

Substitution of equation (115) into (114) and neglecting terms of order
$\epsilon_1\epsilon_2$ and higher gives

$$\ell_{j0} = \ell_{j2} - \sum_{k} D_{jk}\delta_j\sin\phi_{2k} - \sum_{i} E_{ji}(L_{j2})\sin\theta_{2i} \tag{116}$$


Replacing the $L_{j2}$ and $\ell_{j2}$ terms in equations (112) and (116) by their
constant values then gives the relations between the original variables,
the constants of integration and time.

$$L_{j0} = a_j + \sum_{i} A_{2i}(a_h)\,p_{ji}\cos\theta_{2i}(b_h t + c_h) + \sum_{k} C_{2k}(a_h)\,q_{jk}\delta_j\cos\phi_{2k}(b_h t + c_h)$$

$$\ell_{j0} = b_j t + c_j - \sum_{i} E_{ji}(a_h)\sin\theta_{2i}(b_h t + c_h) - \sum_{k} D_{jk}(a_h)\,\delta_j\sin\phi_{2k}(b_h t + c_h) \tag{117}$$

$$(j = 1,\ldots,5) \qquad (h = 1,\ldots,5)$$

where

$$\delta_j = \begin{cases} 0 & j = 1, 2, 3 \\ 1 & j = 4, 5 \end{cases}$$

and

$$b_1 = \beta;\qquad b_2 = 0;\qquad b_3 = \alpha$$

$b_4$ and $b_5$ are given in equation (101) and the constants $a_1, \ldots, a_5$,
$c_1, \ldots, c_5$ are the constants of integration obtained previously.
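Once the constants are known, the solution form of equation (117) is a direct evaluation. The following sketch assumes that the amplitudes `A2`, `E`, `C2`, `D` have already been evaluated at the constants $a_h$ and are passed in as plain numbers; the function name, argument layout and sample values are all hypothetical.

```python
import math

def evaluate_solution(t, a, b, c, A2, E, p, C2, D, q):
    """Evaluate the form of equation (117) at time t.

    a, b, c : length-5 lists of constants (L_j2 = a_j, l_j2 = b_j*t + c_j)
    A2[i], E[j][i], p[j][i] : first-transform amplitudes and multipliers
    C2[k], D[j][k], q[j][k] : second-transform amplitudes and multipliers
    """
    n = len(a)
    delta = [0.0, 0.0, 0.0, 1.0, 1.0]  # delta_j: 0 for j = 1..3, 1 for j = 4, 5
    ell2 = [b[j] * t + c[j] for j in range(n)]
    theta2 = [sum(p[j][i] * ell2[j] for j in range(n)) for i in range(len(A2))]
    phi2 = [sum(q[j][k] * ell2[j] for j in range(n)) for k in range(len(C2))]
    L0, ell0 = [], []
    for j in range(n):
        L0.append(
            a[j]
            + sum(A2[i] * p[j][i] * math.cos(theta2[i]) for i in range(len(A2)))
            + delta[j] * sum(C2[k] * q[j][k] * math.cos(phi2[k]) for k in range(len(C2)))
        )
        ell0.append(
            ell2[j]
            - sum(E[j][i] * math.sin(theta2[i]) for i in range(len(A2)))
            - delta[j] * sum(D[j][k] * math.sin(phi2[k]) for k in range(len(C2)))
        )
    return L0, ell0

# Hypothetical constants: one periodic term from each transform.
a = [1.0, 0.5, 0.2, 0.3, 0.1]
b = [1.0, 0.0, 0.05, 0.01, 0.02]
c = [0.0] * 5
p = [[1], [0], [0], [0], [0]]
q = [[0], [0], [0], [1], [0]]
zeros = [[0.0] for _ in range(5)]
L0, ell0 = evaluate_solution(0.0, a, b, c, [0.01], zeros, p, [0.002], zeros, q)
```

The secular part ($a_j$ and $b_j t + c_j$) and the two sets of periodic terms are kept separate so that either transform's contribution can be inspected on its own.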
CONCLUSIONS


The  analytical  procedure  for  obtaining  a first  order  approximate 
solution  to  equations  evolving  from  equations  representing  a general 
minimum  fuel  low  thrust  problem  has  been  presented.  The  actual 
evaluation  of  the  constants  of  integration  depends  of  course  on  the 
nature  of  each  particular  problem. 

It  may  be  expected  that  this  procedure,  especially  in  the  first 
order  format  presented  here,  will  be  more  applicable  to  situations  in 
which  the  vehicle  makes  many  orbits  around  a central  body  to  attain 
orbital  transfer  or  to  rendezvous  with  or  intercept  another  orbital 
vehicle.  Calculations  of  interplanetary  transfer  with  this  procedure 
will  probably  require  higher  order  approximations. 

The determination of higher order approximations with this
procedure is straightforward through the first transform. This requires
the simple (though tedious) expansion of the $F_1$ disturbing function to
higher powers of the eccentricity; the inclusion of higher order terms
in the expansions of the determining function, $F_1^*$ and all Taylor's
series; and the performance of the extra steps required to determine
the higher order terms of the determining function. The extensions
required in obtaining a canonical form for the second transform are
not obvious. However, it might be noted that the need for the higher
order solution will most likely result when the conical eccentricities
are expected to be large. A look at the term $\epsilon_2$, which multiplies the
entire $F_2$ term, indicates that it is a function of the inverse cube of
the semi-major axis. Hence, as the eccentricity increases, $\epsilon_2$
decreases at a much faster rate and neglecting the entire $F_2$ term
may be feasible.

The next step in the development of this procedure will be the
attainment of numerical results for a physical problem. The weakest
point of the method so far (other than the order limitation) appears to be
the iteration procedure that is required to obtain a good value for the
$\beta$ term in obtaining the canonical formulation for the second transform.
The generalized Newton-Raphson method is used to
determine optimum, coplanar, circle-to-circle, transfer
trajectories for low thrust space vehicles operating in
a strong central force field, such as a near earth orbit.
Optimum thrust steering programs are computed for
progressively increasing values of final time up to durations
involving 26 revolutions about the earth. A description
of the numerical results and a comparison of these with
the results of a previous linear analysis are given.


INTRODUCTION 


This paper is concerned with the computation of optimum
orbital transfer trajectories for space vehicles with
low thrust, electrical propulsion systems operating in a
strong central force field, such as near-earth orbits.
Although the magnitude of thrust acceleration for
interplanetary and geocentric low thrust missions can be similar, the
optimum trajectories for the two missions are quite
different. This is due to the predominant gravitational
attraction of the earth, which at an altitude of 200 miles is more
than 1500 times greater than the gravitational attraction of
the sun at a distance of one astronomical unit. Thus, many
orbital circuits are required for a low thrust vehicle to
complete various geocentric missions.

Many of the problems associated with optimization of
geocentric low thrust trajectories stem from the large
number of revolutions about the earth required of the
vehicle. One of these problems is the sizable accumulation of
round-off and truncation error resulting from the many
integration intervals. A second difficulty, associated with
some of the successive approximation techniques, is the need
to store the control variables as functions of time. If the
functions are rapidly changing ones, the amount of computer
storage required may become prohibitive. A third difficulty,
usually associated with the classical indirect methods for
solving the boundary value problem, is the extreme
sensitivity of terminal conditions to initial conditions of the
multipliers. As the number of revolutions for an optimum
trajectory increases, the sensitivity may be intensified to
a point where systematic computer procedures will not
converge to the desired solution.

Several successive approximation techniques, each
employing a variation-of-parameters integration procedure,
have been developed (Refs. 1 and 2) and programmed for
IBM 7090 computation. Although these methods have proven
partially successful, satisfactory convergence to solutions
of the multiple pass problem has not been achieved. As an
alternate approach to this problem, the generalized
Newton-Raphson method (Refs. 3 and 4) has been used with
considerable success. The algorithm for this method solves a
sequence of linear boundary value problems such that the
sequence of solutions converges to the solution of the
nonlinear problem. Because the linear boundary value problem
is easily handled numerically, the algorithm is readily
adaptable to high speed, digital computation. Another
advantage is that the initial approximations do not have to
satisfy the differential equations or the boundary
conditions. Thus, simple starting functions, such as straight
lines or unperturbed two-body orbits, are usually adequate
for convergence to the desired solution.

The  specific  problem  treated  in  this  paper  is  that  of 
determining  the  optimum  thrust  steering  program  that  will 
minimize  the  time  to  transfer  between  coplanar,  circular 
orbits.  Since  the  thrust  magnitude  is  fixed,  minimum  time 
is  equivalent  to  minimum  fuel. 


SYSTEM  MODEL 


For the system model, only coplanar motion in a
geocentric inverse-square gravity field is considered. The
vehicle is taken as a mass particle with a thrust vector
constant in magnitude and variable in direction. The problem
is to determine the optimum thrust steering program for
minimum time transfer from a circular orbit at an altitude
of 200 statute miles to a higher energy circular orbit.
Because the vehicle's mass decreases linearly with time,
minimizing time is equivalent to minimizing fuel.

The equations of motion in polar coordinates are

    \dot{u} = \frac{v^2}{r} - \frac{k}{r^2} + \frac{T \sin\theta}{m_0 + \dot{m} t} ,

    \dot{v} = -\frac{u v}{r} + \frac{T \cos\theta}{m_0 + \dot{m} t} ,

    \dot{r} = u .

Here, k is the gravitational parameter of the earth; T
is the thrust; θ is the thrust steering angle; m_0 is the
initial mass; and ṁ is the time rate of change of the
vehicle's mass. The state variables are defined in Fig. 1.
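As an illustration of how these state equations behave, the following minimal sketch integrates them for one initial orbital period with the thrust held circumferential (θ = 0). The fourth-order Runge-Kutta integrator, the step count, and all function names are assumptions made for this example; they are not the integration procedure used in the paper. The numerical constants are the basic data quoted in this section.

```python
import math

K = 1.408e16        # gravitational parameter of the earth, ft^3/sec^2
R0 = 2.19825e7      # initial orbit radius, ft
T = 10.0            # thrust, lb
M0 = 310.559        # initial mass, slug
MDOT = -6.21118e-5  # time rate of change of mass, slug/sec


def derivs(t, state, theta):
    """Right-hand sides for (r, u, v) with thrust steering angle theta."""
    r, u, v = state
    a = T / (M0 + MDOT * t)  # thrust acceleration
    return (u,
            v * v / r - K / (r * r) + a * math.sin(theta),
            -u * v / r + a * math.cos(theta))


def rk4_step(t, s, h, theta):
    """One classical Runge-Kutta step (an assumed integrator)."""
    k1 = derivs(t, s, theta)
    k2 = derivs(t + h / 2, [x + h / 2 * k for x, k in zip(s, k1)], theta)
    k3 = derivs(t + h / 2, [x + h / 2 * k for x, k in zip(s, k2)], theta)
    k4 = derivs(t + h, [x + h * k for x, k in zip(s, k3)], theta)
    return [x + h / 6 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(s, k1, k2, k3, k4)]


period = 2 * math.pi * math.sqrt(R0 ** 3 / K)  # initial orbital period, sec
steps = 4000
h = period / steps
state = [R0, 0.0, math.sqrt(K / R0)]           # circular-orbit start
t = 0.0
for _ in range(steps):
    state = rk4_step(t, state, h, theta=0.0)
    t += h

gain_miles = (state[0] - R0) / 5280.0
print(f"radius gain after one initial period: {gain_miles:.1f} statute miles")
```

The gain should be close to the value 4π(T/m_0)/ω_0^2 that the linear analysis gives for one revolution of circumferential thrust, roughly 58 statute miles for these data.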




The basic numerical data used for most of the computations
are

    k = 1.408 \times 10^{16}\ \mathrm{ft^3/sec^2} ,

    r_0 = 2.19825 \times 10^{7}\ \mathrm{ft} ,

    g = 32.2\ \mathrm{ft/sec^2} ,

    T = 10\ \mathrm{lb} ,

    W_0 = 10{,}000\ \mathrm{lb} ,

    I_s = 5000\ \mathrm{sec} ,

    m_0 = W_0 / g = 310.559\ \mathrm{slug} ,

    \dot{m} = -T / (I_s g) = -6.21118 \times 10^{-5}\ \mathrm{slug/sec} ,

    u_0 = 0 ,

    v_0 = (k / r_0)^{1/2} ,

where g is the gravitational acceleration at the surface
of the earth, W_0 is the vehicle's initial weight, and I_s
is the specific impulse.
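The derived entries in this list can be checked directly from the primary data. The short sketch below recomputes m_0 and ṁ and, as additional quantities not tabulated in the paper, the initial circular speed, the initial orbital period, and the thrust-to-weight ratio. The variable names are chosen for this example only.

```python
import math

K = 1.408e16    # ft^3/sec^2
R0 = 2.19825e7  # ft
G = 32.2        # ft/sec^2
T = 10.0        # lb
W0 = 10000.0    # lb
ISP = 5000.0    # sec

m0 = W0 / G                  # initial mass, slug
mdot = -T / (ISP * G)        # mass rate, slug/sec
v0 = math.sqrt(K / R0)       # initial circular speed, ft/sec
omega0 = math.sqrt(K / R0 ** 3)            # initial orbital frequency, rad/sec
period_min = 2 * math.pi / omega0 / 60.0   # initial orbital period, minutes
thrust_to_weight = T / W0    # initial thrust acceleration in g's

print(f"m0 = {m0:.3f} slug, mdot = {mdot:.5e} slug/sec")
print(f"v0 = {v0:.1f} ft/sec, period = {period_min:.2f} min, T/W0 = {thrust_to_weight}")
```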
VARIATIONAL TREATMENT


The results of this paper were obtained by the indirect
method of the calculus of variations in conjunction with the
generalized Newton-Raphson algorithm (Ref. 4). The
existence of a solution to the nonlinear optimal control problem
is assumed and the necessary conditions are obtained by the
application of the Pontryagin maximum principle (Ref. 5).
For the problem treated herein, the necessary conditions may
also be obtained by classical procedures (Refs. 6 and 7).
These necessary conditions form a nonlinear, two-point,
boundary value problem. For the problem of this paper, the
relevant boundary value problem is given by the following
sixth order system:


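    \dot{r} = u = f^{(1)} ,

    \dot{u} = \frac{v^2}{r} - \frac{k}{r^2} + \frac{a(t)\,\lambda_u}{(\lambda_u^2 + \lambda_v^2)^{1/2}} = f^{(2)} ,

    \dot{v} = -\frac{u v}{r} + \frac{a(t)\,\lambda_v}{(\lambda_u^2 + \lambda_v^2)^{1/2}} = f^{(3)} ,

    \dot{\lambda}_r = \left( \frac{v^2}{r^2} - \frac{2k}{r^3} \right) \lambda_u - \frac{u v}{r^2}\,\lambda_v = f^{(4)} ,

    \dot{\lambda}_u = -\lambda_r + \frac{v}{r}\,\lambda_v = f^{(5)} ,

    \dot{\lambda}_v = -\frac{2v}{r}\,\lambda_u + \frac{u}{r}\,\lambda_v = f^{(6)} ,

where

    a(t) = \frac{T}{m_0 + \dot{m} t} ,

and the boundary conditions are

    at t = 0:                       r(0) = r_0 ,    u(0) = u_0 ,    v(0) = v_0 ,

    at t = t_f (t_f unspecified):   r(t_f) = r_f ,  u(t_f) = u_f ,  v(t_f) = v_f .

This may be written as

    \dot{X} = F(X, t) ,

where

    X = (x^{(1)}, \ldots, x^{(6)}) ,    F = (f^{(1)}, \ldots, f^{(6)}) ,

and

    x^{(1)}(t) = r(t) ,          x^{(2)}(t) = u(t) ,          x^{(3)}(t) = v(t) ,

    x^{(4)}(t) = \lambda_r(t) ,  x^{(5)}(t) = \lambda_u(t) ,  x^{(6)}(t) = \lambda_v(t) .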

The generalized Newton-Raphson algorithm proceeds by
solving the following sequence of linear, two-point,
boundary value problems:

    \dot{X}_{n+1} = J(X_n, t)\,[X_{n+1}(t) - X_n(t)] + F(X_n, t) ,    n = 0, 1, 2, \ldots ,

where J(X, t) is the Jacobian matrix of partial
derivatives of the f^(i) with respect to the x^(j), i, j = 1, ..., 6.
The boundary conditions for every n are those given above.
A starting vector, X_0(t), and an estimated final time,
t_f0, are assumed and the sequence of linear boundary value
problems is solved numerically by the method described in
detail in Ref. 4.
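The structure of the iteration can be illustrated on a much smaller problem. The sketch below applies the same linearize-and-solve sequence to the scalar test problem y'' = 1.5 y^2, y(0) = 4, y(1) = 1, which has the known solution y = 4/(1 + t)^2. The test problem, the Heun integrator, the superposition of one particular and one homogeneous solution, and all function names are assumptions made for this example; the paper's own numerical procedure is the one described in Ref. 4. As in the text, the starting function is a straight line that satisfies neither the differential equation nor anything beyond the end values.

```python
def integrate_linear(q, g, y0, dy0, h):
    """Heun integration of the linear equation y'' = q(t)*y + g(t) on a grid."""
    y, dy = [y0], [dy0]
    for i in range(len(q) - 1):
        f0 = (dy[i], q[i] * y[i] + g[i])
        yp = y[i] + h * f0[0]           # predictor for y
        dyp = dy[i] + h * f0[1]         # predictor for y'
        f1 = (dyp, q[i + 1] * yp + g[i + 1])
        y.append(y[i] + 0.5 * h * (f0[0] + f1[0]))
        dy.append(dy[i] + 0.5 * h * (f0[1] + f1[1]))
    return y


def quasilinearize(n_steps=2000, n_iter=8):
    h = 1.0 / n_steps
    t = [i * h for i in range(n_steps + 1)]
    y = [4.0 - 3.0 * ti for ti in t]    # straight-line starting function
    for _ in range(n_iter):
        # Linearization of y'' = 1.5*y^2 about the current iterate y_n:
        #   y'' = 3*y_n*y - 1.5*y_n^2
        q = [3.0 * yi for yi in y]
        g = [-1.5 * yi * yi for yi in y]
        zero = [0.0] * len(g)
        particular = integrate_linear(q, g, 4.0, 0.0, h)
        homogeneous = integrate_linear(q, zero, 0.0, 1.0, h)
        # Choose the multiple of the homogeneous solution that meets y(1) = 1.
        c = (1.0 - particular[-1]) / homogeneous[-1]
        y = [p + c * hm for p, hm in zip(particular, homogeneous)]
    return t, y


t, y = quasilinearize()
max_err = max(abs(yi - 4.0 / (1.0 + ti) ** 2) for ti, yi in zip(t, y))
print(f"maximum deviation from 4/(1+t)^2: {max_err:.2e}")
```

Each pass solves only linear initial value problems, which is what makes the method convenient for digital computation; the nonlinear equation is never integrated directly.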

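The basic starting vector X_0(t) is of the following
simple form:

    x_0^{(1)}(t) = r_0(t) = r_0 + \frac{r_f - r_0}{t_{f_0}}\, t ,

    x_0^{(2)}(t) = u_0(t) \equiv 0 ,

    x_0^{(3)}(t) = v_0(t) = \left[ \frac{k}{r_0(t)} \right]^{1/2} ,

    x_0^{(4)}(t) = \lambda_{r_0}(t) \equiv 1 ,

    x_0^{(5)}(t) = \lambda_{u_0}(t) =
        \begin{cases} c_1 & \text{for } t \in [0, \tfrac{1}{2} t_{f_0}] \\
                      c_2 & \text{for } t \in (\tfrac{1}{2} t_{f_0}, t_{f_0}] \end{cases}

    x_0^{(6)}(t) = \lambda_{v_0}(t) =
        \begin{cases} c_3 & \text{for } t \in [0, \tfrac{1}{2} t_{f_0}] \\
                      c_4 & \text{for } t \in (\tfrac{1}{2} t_{f_0}, t_{f_0}] \end{cases}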
Most of the results described in this report were
obtained by first producing a solution using the above simple
starting vector, X_0(t). Then a parametric study was
performed by varying the relevant parameters (T, ṁ, r_f, etc.)
and employing the solution for the previous set of
parameters as the starting vector for the succeeding set.

For transfers that required more than approximately
two-thirds of a revolution, the constants c_1, c_2, c_3, and
c_4 above were chosen to correspond to constant
circumferential thrust. For shorter term transfers, these constants
were chosen to correspond to an initial thrust program that
is outward along the local vertical for the first half of
the transit time, and inward along the local vertical for
the remaining half. For a few transfers, which required
many revolutions, the solution to the nonlinear state
equations corresponding to constant circumferential thrust was
used for the starting vector, X_0(t). For the many
revolution transfers, this choice of starting function appears
more efficient than the simplified starting functions given
above, even though it does not meet the boundary conditions.

For purposes of obtaining transfers that have certain
particular properties, it was found convenient to treat the
original time optimal problem as a fixed time, maximum
radius problem. This introduces a boundary condition of a
more general class than previously handled within the
framework of the generalized Newton-Raphson algorithm. The
boundary condition appears as a nonlinear functional
relation between the final value of the local horizontal
velocity, v(t_f), and the final radial distance, r(t_f), viz.,

    \phi(r(t_f), v(t_f)) = v(t_f) - [k / r(t_f)]^{1/2} = 0 .

The procedure used for these cases was as follows: An
approximate value for v(t_f) was changed automatically at
each step of the iteration on X_n by means of the
recursion formula

    v_{n+1}(t_f) = \tfrac{1}{2}\,[k / r_n(t_f)]^{1/2} \left[ 3 - \frac{r_{n+1}(t_f)}{r_n(t_f)} \right] .

This formula results from the Newton-Raphson sequence for
the scalar valued mapping φ, with an initial estimate
(r_0(t_f), v_0(t_f)). As n → ∞, X_n(t) → X*(t), r_n(t_f) → r*,
and v_n(t_f) → v*, where X*(t) is the solution of the nonlinear
differential equations and (r*, v*) is the solution of the
boundary relation φ = 0. This procedure was entirely
systematic and exhibited good convergence properties over the
range of problems studied herein.
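The origin of a recursion of this kind can be seen by linearizing the boundary relation about the current iterate. The sketch below assumes that φ is the circular-orbit condition written as v - (k/r)^{1/2}, which is an interpretation of the printed formula rather than a form confirmed by the source; all quantities are evaluated at t = t_f.

```latex
\begin{aligned}
\phi(r, v) &= v - (k/r)^{1/2} , \\
0 &= \phi(r_n, v_n)
     + \left.\frac{\partial \phi}{\partial r}\right|_{n} (r_{n+1} - r_n)
     + \left.\frac{\partial \phi}{\partial v}\right|_{n} (v_{n+1} - v_n) \\
  &= v_{n+1} - (k/r_n)^{1/2}
     + \tfrac{1}{2}\, k^{1/2}\, r_n^{-3/2}\, (r_{n+1} - r_n) , \\
v_{n+1} &= \tfrac{1}{2}\, (k/r_n)^{1/2} \left[ 3 - \frac{r_{n+1}}{r_n} \right] .
\end{aligned}
```

Under this assumption the terminal condition imposed on each linear problem is itself linear in r_{n+1}(t_f) and v_{n+1}(t_f), and it reduces to v* = (k/r*)^{1/2} once r_{n+1} and r_n coincide.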


COMPUTATIONAL  RESULTS 

Computer programs utilizing the generalized
Newton-Raphson method have been developed to optimize
circle-to-circle transfers both for minimum time problems with
specified values of final radius and for maximum radius problems
with specified values of final time. The minimum time
program was used to generate solutions for progressively
increasing values of final radius up to durations involving
21.3 revolutions about the earth. The basic numerical data,
given on the preceding pages, were used for this series of
computations.
For values of final time up to a few orbital periods,
the results are quite similar to those obtained from a
previous near-circular linear analysis (Ref. 8). Figure 2
shows the optimum thrust steering programs for very short
durations, up to one orbital period. Although the solutions
shown are taken from the linear analysis of Ref. 8, the
differences between these and the latest nonlinear results are
at most 2° for the one revolution case. The time scale for
each solution has been normalized so that a comparison may
be made on a common scale for which the normalized time
varies from zero to one. It is noted that the time
variation of the thrust steering angle, θ, is antisymmetrical
with respect to the midpoint. For the very short durations,
1/6- to 1/2-revolution, the θ motion has a mean of
θ = 180° (opposite in direction to circumferential thrust),
whereas the corresponding motion for durations of
2/3-revolution and longer takes place about a mean of θ = 0
(circumferential thrust). Also shown in Fig. 2 is the thrust
steering angle for one revolution. For this case the thrust
is very nearly circumferential.

In Fig. 3 the θ scale is considerably enlarged and a
comparison is made between the linear and nonlinear
solutions for the 2½-revolution example. The difference is
still very small, about ±3°. However, it is noted that the
duration of the last period of the nonlinear solution is
slightly longer than that of the linear solution, whereas
the first periods of the two θ time histories are almost
identical in length. This characteristic is more apparent
for the longer duration transfers, and is due to the thrust
steering angle θ always being in phase with the vehicle's
orbital angle, φ (see Fig. 1); i.e., as the altitude
increases, the orbital period, and therefore the period of θ
motion, also increases.

FIG. 2 NORMALIZED OPTIMUM THRUST STEERING ANGLE FOR TRANSFER
TIMES UP TO ONE ORBITAL PERIOD

FIG. 3 OPTIMUM THRUST STEERING ANGLE FOR 2.5 REVOLUTIONS

Figure 4 is also taken from the linear analysis of
Ref. 8 because the differences are still relatively small.
For N = 1½, 2½, and 3½, the amplitude of motion is
decreasing and approaching a circumferential thrust program.
Also, for N = 1, 2, and 3, the thrust program of the
linear analysis is exactly circumferential, and only nearly
circumferential for the nonlinear results of this report. A
search was made for a θ(t) ≡ 0 program for a series of
solutions from 13½ to 14½ revolutions. It is clear from
the results of this search that an exactly circumferential
thrust program does not exist, the closest being a minimum
amplitude of 3.2°. It has been proven independently by
H. J. Kelley and R. McGill that θ(t) ≡ 0 does not satisfy
the Euler-Lagrange equations, except in the limiting case
when the thrust acceleration, T/m, vanishes.

A typical optimum thrust steering program for 19
revolutions is shown in Fig. 5. As previously mentioned, the θ
motion throughout the flight is in phase with the orbital
motion. Also characteristic is the relatively large θ
motion at the beginning of the transfer, which diminishes to
an amplitude of 3½° or more near the end of the maneuver. A
check of the time history of eccentricity reveals that the
maximum values of eccentricity build up from .0045 to
.0095, and that the minimum values are very small (less
than .0004) but never exactly zero except at the two
terminals of the transfer. Using the generalized Newton-Raphson
method, this particular solution required computation of 24
trajectories, 5 iteration cycles, 425 constant integration
intervals per trajectory, an average of 22 intervals per
orbit (17 intervals for the first orbit), and a total of 49
seconds of IBM 7094 computer time.

FIG. 4 OPTIMUM THRUST STEERING ANGLE FOR [...] REVOLUTIONS

The solutions obtained are, of course, locally optimum,
and no attempt has been made to search for a different class
of optimum trajectories that may yield better performance.
Should such a class of solutions exist, they would most
likely be revealed for the significantly longer duration
maneuvers.

Because the equations of motion of the linear analysis
contain only the single parameter T/(m_0 ω_0^2), it is possible
to plot a "miles-per-gallon" nondimensional parameter,
Δr ω_0^2 / [2π N_f (T/m_0)], as a function of the number of
revolutions, N_f, required to complete the transfer. This
general type of plot, taken from Ref. 8, is presented in Fig. 6.
Given the thrust/mass ratio of the vehicle and the frequency
of the original orbit, the increase in radius of the
circular orbit may easily be computed as a function of the number
of revolutions.

Similar performance results, obtained with the
generalized Newton-Raphson method, are shown in Fig. 7 and do not
significantly differ from the linear results. The improved
performance is due to the more realistic mathematical model
of the nonlinear analysis that takes into account the
decrease in gravitational attraction and reduction in vehicle
mass as the duration of the maneuver increases.

FIG. 7 NONDIMENSIONAL ALTITUDE GAIN PARAMETER
All of the previously discussed numerical results apply
to a vehicle with T/W_0 = .001 g's and I_s = 5000 seconds.
A brief vehicle parameter variation was carried out and is
summarized in the following table for a fixed value of final
time equal to 40.29 hours. The final time of 40.29 hours
was selected because it corresponds to a transfer of 20
revolutions using the basic numerical data.


    T/W_0      I_s       ΔR         N_f
    (g's)      (sec)     (miles)    (rev)

    .0025      5000      10,658.    13.064
    .001       5000       2,134.    20.046
    .0005      5000        893.6    23.123
    .00025     5000        413.9    24.792
    .0001      5000        158.2    25.852
    .001       1000       2,323.    19.824
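The nondimensional altitude gain parameter Δr ω_0^2 / [2π N_f (T/m_0)], for which the linear analysis gives the value 2, can be evaluated from each row of this table. The sketch below does so using the basic numerical data; it assumes that the tabulated miles are statute miles of 5280 ft and that T/m_0 equals (T/W_0) g. The variable names are chosen for this example only.

```python
import math

K = 1.408e16    # ft^3/sec^2
R0 = 2.19825e7  # ft
G = 32.2        # ft/sec^2
FT_PER_MILE = 5280.0  # assumption: the table uses statute miles

# (T/W0 in g's, Isp in sec, altitude gain in miles, revolutions)
rows = [
    (0.0025, 5000, 10658.0, 13.064),
    (0.001, 5000, 2134.0, 20.046),
    (0.0005, 5000, 893.6, 23.123),
    (0.00025, 5000, 413.9, 24.792),
    (0.0001, 5000, 158.2, 25.852),
    (0.001, 1000, 2323.0, 19.824),
]

omega0_sq = K / R0 ** 3  # square of the initial orbital frequency

params = []
for tw, isp, gain_miles, n_rev in rows:
    accel = tw * G                   # initial thrust acceleration, ft/sec^2
    delta_r = gain_miles * FT_PER_MILE
    p = delta_r * omega0_sq / (2.0 * math.pi * n_rev * accel)
    params.append(p)
    print(f"T/W0 = {tw:<8} Isp = {isp:<5} parameter = {p:.3f}")
```

For the I_s = 5000 sec rows the parameter falls steadily toward the linear value of 2 as the thrust level, and with it the altitude gain, is reduced.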

Although there was no difficulty in computing an
optimum transfer consisting of 21.3 revolutions, it was not
possible to achieve convergence to an accuracy of four
significant figures for a transfer involving 21.5 revolutions.
This appears to be the limit for the generalized
Newton-Raphson method employing ordinary polar coordinates and a
simple second order, modified Adams, predictor-corrector,
numerical integration procedure. The difficulty appears to
be associated with the size and number of integration
intervals rather than the number of revolutions, because there
was no difficulty in computing a 26-revolution transfer with
a thrust acceleration of .0001 g's.
APPROXIMATE ANALYTICAL SOLUTIONS


In Ref. 8, optimum, low thrust transfers between
neighboring circular orbits were determined for vehicles with
constant thrust acceleration. It was shown that if the
deviations from an original circular orbit are small, the
equations may be linearized, and the resulting optimal
solutions are globally minimizing. Furthermore, whenever the
duration of powered flight is some integral multiple of the
orbital period, the optimum thrust direction is
circumferential, and the vehicle passes through a higher energy
circular orbit condition at the end of each revolution.


The numerical results of the linear analysis (Ref. 8)
show that for an integral number of revolutions

    \frac{y_f\, \omega_0^2}{\tau_f\, (T/m_0)} = 2 ,                      (1)

where y_f = Δr = r_f - r_0 is the gain in altitude, ω_0 is
the initial orbital frequency, τ_f = ω_0 t_f is the
nondimensional value of final time, and T/m_0 is the constant
thrust acceleration. For constant orbital frequency, the
number of revolutions, N_f, is by definition

    N_f = \frac{\omega_0 t_f}{2\pi} .                                   (2)

Equation (1) may be rewritten as

    \frac{\Delta r\, \omega_0^2}{2\pi N_f\, (T/m_0)} = 2 ,              (3)

which is the altitude gain parameter plotted in Figs. 6 and
7.



Equation (3) may also be derived from simple energy
concepts, assuming that the thrust program is
circumferential. The final energy of the vehicle is expressed as the
sum of the initial energy plus the work done by the rocket
(work is thrust multiplied by the distance traveled, 2π r_0 N_f):

    \frac{V_f^2}{2} - \frac{k}{r_f} = \frac{V_0^2}{2} - \frac{k}{r_0} + 2\pi r_0 N_f\, (T/m_0) .      (4)

Because the initial and final orbits are circular
(V_f^2 = k/r_f and V_0^2 = k/r_0), Eq. (4) reduces to

    \frac{k\, (r_f - r_0)}{2\, r_0\, r_f} = 2\pi r_0 N_f\, (T/m_0) .      (5)

Also, for neighboring circular orbits,

    \omega_0^2 = \frac{k}{r_0^3} \approx \frac{k}{r_0^2\, r_f} ,

which reduces Eq. (5) to (3).

As a measure of performance, the following two
equations, obtained from Eqs. (2) and (3), indicate the number
of revolutions and time it takes for a given vehicle to
transfer between circular orbits with specified radii:

    N_f = \frac{\omega_0^2\, (r_f - r_0)}{4\pi\, (T/m_0)} ,              (6)

    t_f = \frac{\omega_0\, (r_f - r_0)}{2\, (T/m_0)} .                   (7)

Comparison of the numerical results, based on the
nonlinear mathematical model, with those obtained from Eqs. (6)
and (7), shows that there is good agreement for transfers
involving one or two revolutions. Thereafter, the
difference between linear and nonlinear results progressively
increases (see Fig. 7). For the 19-revolution example, the
errors in the above linear equations are 77 per cent for
N_f and 36 per cent for t_f. This is due to the assumption in
the linear analysis that the gravitational attraction and
mass of the vehicle are constant.


If, however, ω_0 and m_0 in Eqs. (6) and (7) are
continuously rectified, the new expressions should be in closer
agreement with the nonlinear results. In the following
derivation, N and t are considered as functions of r:

    \frac{dN}{dr} = \frac{\omega_0^2}{4\pi\, (T/m_0)} ,                  (8)

    \frac{dt}{dr} = \frac{\omega_0}{2\, (T/m_0)} .                       (9)

For r substantially greater than r_0, the quantity ω_0^2
is replaced by ω^2 = k/r^3, and integration is carried out
with respect to r:

    N_f = \frac{k}{8\pi\, (T/m_0)} \left( \frac{1}{r_0^2} - \frac{1}{r_f^2} \right) ,              (10)

    t_f = \frac{k^{1/2}}{T/m_0} \left( \frac{1}{r_0^{1/2}} - \frac{1}{r_f^{1/2}} \right) .         (11)

For the 19-revolution example, the errors with respect to
the computed nonlinear results are reduced to 1.06 per cent
for N_f and 1.23 per cent for t_f.
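The size of these errors can be reproduced approximately from data printed in this paper. The sketch below takes the tabulated 20-revolution case (T/W_0 = .001 g's, altitude gain 2,134 miles, 20.046 revolutions, 40.29 hours) instead of the 19-revolution example, whose final radius is not quoted. It compares the linear estimate N_f = ω_0^2 Δr / [4π (T/m_0)] with rectified estimates obtained by integrating dN/dr = k / [4π (T/m_0) r^3] and dt/dr = k^{1/2} / [2 (T/m_0) r^{3/2}] at constant mass. These integrated forms are worked out here from the description in the text and should be read as an assumption, as should the use of statute miles.

```python
import math

K = 1.408e16    # ft^3/sec^2
R0 = 2.19825e7  # ft
ACCEL = 0.001 * 32.2             # T/m0 for T/W0 = .001 g's, ft/sec^2
RF = R0 + 2134.0 * 5280.0        # final radius, ft (statute miles assumed)
N_COMPUTED = 20.046              # revolutions, from the table
T_COMPUTED_HR = 40.29            # final time in hours, from the text

omega0_sq = K / R0 ** 3

# Linear estimate with constant orbital frequency.
n_linear = omega0_sq * (RF - R0) / (4.0 * math.pi * ACCEL)

# Rectified estimates: orbital frequency follows k/r^3, mass held constant.
n_rect = K / (8.0 * math.pi * ACCEL) * (1.0 / R0 ** 2 - 1.0 / RF ** 2)
t_rect_hr = (math.sqrt(K / R0) - math.sqrt(K / RF)) / ACCEL / 3600.0

err_linear = n_linear / N_COMPUTED - 1.0
err_rect = n_rect / N_COMPUTED - 1.0
err_time = t_rect_hr / T_COMPUTED_HR - 1.0

print(f"linear N_f    = {n_linear:.2f}  (error {100 * err_linear:.0f} per cent)")
print(f"rectified N_f = {n_rect:.2f}  (error {100 * err_rect:.2f} per cent)")
print(f"rectified t_f = {t_rect_hr:.2f} hr (error {100 * err_time:.2f} per cent)")
```

The rectified time estimate is simply the change in circular speed divided by the thrust acceleration, which is why it tracks the computed results so much more closely than the constant-frequency formula.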

Because mass in the integrals of Eqs. (8) and (9) is
treated as a constant, a further improvement is possible by
utilizing Eq. (11) and expressing mass as a function of r:

    m = m_0 + \dot{m} t = m_0 \left\{ 1 + \frac{\dot{m}}{T} \left[ \left( \frac{k}{r_0} \right)^{1/2} - \left( \frac{k}{r} \right)^{1/2} \right] \right\} .      (12)

If this expression is substituted in Eqs. (8) and (9) and
integration is carried out again, then
For the 19-revolution example, the errors are further
reduced to 0.20 per cent for N_f and 0.15 per cent for t_f.

Because Eq. (14) is an improved expression for t as a
function of r, it is possible to repeat the integration, again
and again if necessary, in an attempt to reduce the errors
further. This has not been carried out, as the accuracy of
the nonlinear results is not better than four significant
figures.

Summary

The generalized Newton-Raphson method, an iterative
procedure for solving nonlinear operator equations, has
been extended in application to variational problems with
bounded control variables. A minimum fuel interplanetary
low thrust orbital transfer problem is worked out in detail
to demonstrate the practical aspects of the algorithm as
well as its computational effectiveness. The control
variables are the thrust magnitude, limited from zero to some
prescribed maximum value, and the thrust steering angle.
INTRODUCTION


For variational problems not involving inequality
constraints on state or control variables, the state equations
and Euler-Lagrange equations generally consist of a system
of nonlinear differential equations with two-point boundary
conditions. For such a system, the generalized
Newton-Raphson technique proceeds by solving a sequence of linear
boundary value problems in such a manner that the sequence
of solutions converges to the solution of the nonlinear
boundary value problem. The generalized Newton-Raphson
operator technique has been developed for such systems of
ordinary differential equations with two-point boundary
conditions (Ref. 1) and successfully applied to various
unconstrained variational problems (Ref. 2).

In this paper, we consider variational problems with
inequality constraints on at least one control variable.
Following Valentine (Ref. 3), a new variable is introduced
such that the inequality constraint may be replaced by an
equivalent equality constraint. The resulting nonlinear
system of state and Euler-Lagrange equations now consists of
differential equations and algebraic equations. The
generalized Newton-Raphson method is applied to this nonlinear
operator equation. Again, this is accomplished by solving
a sequence of linear operator equations such that the
sequence of solutions converges to the solution of the
nonlinear operator equation.
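The device of replacing an inequality by an equality can be illustrated numerically. In the sketch below, a bounded control u with u_min ≤ u ≤ u_max is paired with a real slack variable α satisfying (u - u_min)(u_max - u) - α^2 = 0, the form used for the mass flow later in this paper. A real α exists exactly when u lies within its bounds, and α = 0 marks a control on a bound. The function name and the sample values are hypothetical.

```python
import math


def slack_variable(u, u_min, u_max):
    """Return a real alpha with (u - u_min)*(u_max - u) = alpha**2, or None.

    None signals that no real alpha exists, i.e. u violates its bounds.
    """
    product = (u - u_min) * (u_max - u)
    if product < 0.0:
        return None
    return math.sqrt(product)


inside = slack_variable(0.5, 0.0, 1.0)     # interior control, alpha > 0
on_bound = slack_variable(1.0, 0.0, 1.0)   # control on its upper bound, alpha = 0
outside = slack_variable(1.5, 0.0, 1.0)    # bound violated, no real alpha

print(inside, on_bound, outside)
```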

The algorithm is applied to the computation of minimum
fuel, low-thrust, Earth-to-Mars orbit transfer trajectories
with bounded thrust magnitude.
PROBLEM FORMULATION


Given the differential constraints

    \dot{x}_i = f_i(t, x_1, \ldots, x_n, u_1, \ldots, u_m) ,    i = 1, 2, \ldots, n ,      (1)

and at most 2n+1 end conditions involving t and x_i, as
well as inequality constraints

    u_{\kappa\,\min} \le u_\kappa \le u_{\kappa\,\max} ,    \kappa = 1, \ldots, r \le m ,

the problem is to determine the state variables x_i(t) and
control variables u_j(t) so as to minimize the function

    P = P(t_0, t_f, x_1(t_0), \ldots, x_n(t_0), x_1(t_f), \ldots, x_n(t_f)) .

A set of new real variables α_κ is introduced
(Refs. 3-6), and the inequality constraints on the control
variables are replaced by

    (u_\kappa - u_{\kappa\,\min})(u_{\kappa\,\max} - u_\kappa) - \alpha_\kappa^2 = 0 ,    \kappa = 1, \ldots, r .      (2)
Introducing the function

    F = \sum_{i=1}^{n} \lambda_i\, (\dot{x}_i - f_i)
        + \sum_{\kappa=1}^{r} \lambda_{\alpha_\kappa} \left[ (u_\kappa - u_{\kappa\,\min})(u_{\kappa\,\max} - u_\kappa) - \alpha_\kappa^2 \right] ,      (3)

where the λ(t) are undetermined multipliers, we obtain, from a
modification of the classical calculus of variations (Refs. 4-8),
as necessary conditions for the existence of a
local minimum of P:

(a) the Euler-Lagrange equations

    \frac{d}{dt} \frac{\partial F}{\partial \dot{x}_i} - \frac{\partial F}{\partial x_i} = 0 ,    i = 1, 2, \ldots, n ,

    \frac{\partial F}{\partial u_j} = 0 ,    j = 1, 2, \ldots, m ,                      (4)

    \frac{\partial F}{\partial \alpha_\kappa} = 0 ,    \kappa = 1, 2, \ldots, r ,

(b) the transversality conditions

    dP + \left[ \sum_{i=1}^{n} \lambda_i\, dx_i - H\, dt \right]_{t_0}^{t_f} = 0 ,

    where the dt and dx_i are differentials which are
    connected by the prescribed end conditions,

(c) the Weierstrass condition, which for this problem is
    equivalent (Ref. 4) to the requirement that

    H = \sum_{i=1}^{n} \lambda_i\, f_i(t, x_1, \ldots, x_n, u_1, \ldots, u_m)

    be maximum with respect to the control variables u_j
    satisfying the imposed inequality constraints.
To obtain the solution of the problem stated above, the
generalized Newton-Raphson algorithm is applied to the
operator equation consisting of Eqs. (1), (2), and (4). This
operator equation consists of a two-point boundary value
system of order 2n, in addition to a system of scalar
equations of order m + 2r. The following numerical example
should clarify the computational procedure.

LOW-THRUST ORBITAL TRANSFER EXAMPLE - MINIMUM FUEL

The problem we wish to consider is closely related to
the last example in Ref. 2. Kelley et al. (Ref. 9) have
obtained results for the minimum time version of the problem
via gradient techniques. We wish to minimize the fuel
consumption of a low-thrust ion rocket which is to transfer
from the orbit of Earth to the orbit of Mars, in fixed time.
The orbits of Earth and Mars are assumed to be circular and
coplanar, and the gravitational attractions of the two
planets are neglected. The system parameters are: the
initial mass m_0, 46.58 slugs; the constant equivalent exit
velocity, c = 1.831 x 10^5 ft/sec; and the propellant mass
flow, β, which is required to remain within the bounds
β_max = 6.937 x 10^-7 slugs/sec and β_min = 0.
We now proceed to the formal statement of the problem.
Given the differential constraints:

    \dot{r} = w ,

    \dot{w} = \frac{v^2}{r} - \frac{k}{r^2} + \frac{c\beta}{m} \sin\theta ,

    \dot{v} = -\frac{w v}{r} + \frac{c\beta}{m} \cos\theta ,                      (5)

    \dot{m} = -\beta ,

with the boundary conditions:

    t = t_0:   r(t_0) = r_0 ,   w(t_0) = w_0 ,   v(t_0) = v_0 ,   m(t_0) = m_0 ,

    t = t_f:   r(t_f) = r_f ,   w(t_f) = w_f ,   v(t_f) = v_f ,   m(t_f) open ,

where w and v are the radial and circumferential
velocities respectively; r is the radius; and θ is the thrust
direction angle measured from the local horizontal. In
addition, given the inequality constraints

    \beta_{\min} \le \beta \le \beta_{\max} ,

determine the state variables r(t), w(t), v(t), m(t) and
control variables θ(t) and β(t) so as to minimize

    P = -m(t_f) .


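Rewriting the inequality constraints on β as

    (\beta - \beta_{\min})(\beta_{\max} - \beta) - \alpha^2 = 0 ,

the Euler-Lagrange equations, Eqs. (4), become

    \dot{\lambda}_r = \lambda_w \left( \frac{v^2}{r^2} - \frac{2k}{r^3} \right) - \lambda_v \frac{w v}{r^2} ,

    \dot{\lambda}_w = -\lambda_r + \lambda_v \frac{v}{r} ,

    \dot{\lambda}_v = -2 \lambda_w \frac{v}{r} + \lambda_v \frac{w}{r} ,

    \dot{\lambda}_m = \frac{c\beta}{m^2} (\lambda_w \sin\theta + \lambda_v \cos\theta) ,                      (6)

    0 = \frac{c\beta}{m} (-\lambda_w \cos\theta + \lambda_v \sin\theta) ,

    0 = \frac{c}{m} (\lambda_w \sin\theta + \lambda_v \cos\theta) - \lambda_m - \lambda_\alpha (\beta_{\max} + \beta_{\min} - 2\beta) ,

    0 = \alpha \lambda_\alpha .

Equations (6) and the Weierstrass condition imply

    \sin\theta = \frac{\lambda_w}{(\lambda_w^2 + \lambda_v^2)^{1/2}} ,    \cos\theta = \frac{\lambda_v}{(\lambda_w^2 + \lambda_v^2)^{1/2}} .      (7)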
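The steering law of Eqs. (7) is easy to evaluate once the multipliers are known. The sketch below computes sin θ and cos θ from λ_w and λ_v, and also shows one way the mass flow could be selected from the sign of a switching quantity, (c/m)(λ_w^2 + λ_v^2)^{1/2} - λ_m. That bang-bang rule is an inference from maximizing H with respect to β and ignores the possibility of intermediate thrust arcs; it is not a procedure stated in the paper, and the function names and sample numbers are hypothetical.

```python
import math


def steering(lam_w, lam_v):
    """Return (sin(theta), cos(theta)) from the velocity multipliers, Eqs. (7)."""
    norm = math.hypot(lam_w, lam_v)
    return lam_w / norm, lam_v / norm


def mass_flow(lam_w, lam_v, lam_m, m, c, beta_min, beta_max):
    """Assumed bang-bang choice of the propellant mass flow beta."""
    switch = (c / m) * math.hypot(lam_w, lam_v) - lam_m
    return beta_max if switch > 0.0 else beta_min


sin_t, cos_t = steering(3.0, 4.0)
theta_deg = math.degrees(math.atan2(sin_t, cos_t))

# Illustrative values only: thrust on when the switching quantity is positive.
beta_on = mass_flow(3.0, 4.0, 1.0, 46.58, 1.831e5, 0.0, 6.937e-7)
beta_off = mass_flow(3.0, 4.0, 1.0e6, 46.58, 1.831e5, 0.0, 6.937e-7)

print(f"sin = {sin_t:.3f}, cos = {cos_t:.3f}, theta = {theta_deg:.2f} deg")
print(beta_on, beta_off)
```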
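Substitution of Eqs. (7) into Eqs. (5) and Eqs. (6) now
yields the nonlinear boundary value problem

    \dot{r} = w = f^1 ,

    \dot{w} = \frac{v^2}{r} - \frac{k}{r^2} + \frac{c\beta}{m} \frac{\lambda_w}{(\lambda_w^2 + \lambda_v^2)^{1/2}} = f^2 ,

    \dot{v} = -\frac{w v}{r} + \frac{c\beta}{m} \frac{\lambda_v}{(\lambda_w^2 + \lambda_v^2)^{1/2}} = f^3 ,

    \dot{m} = -\beta = f^4 ,

    \dot{\lambda}_r = \lambda_w \left( \frac{v^2}{r^2} - \frac{2k}{r^3} \right) - \lambda_v \frac{w v}{r^2} = f^5 ,      (8)

    \dot{\lambda}_w = -\lambda_r + \lambda_v \frac{v}{r} = f^6 ,

    \dot{\lambda}_v = -2 \lambda_w \frac{v}{r} + \lambda_v \frac{w}{r} = f^7 ,

    \dot{\lambda}_m = \frac{c\beta}{m^2} (\lambda_w^2 + \lambda_v^2)^{1/2} = f^8 ,

with boundary conditions

    t = t_0:   r(t_0) = r_0 ,   w(t_0) = w_0 ,   v(t_0) = v_0 ,   m(t_0) = m_0 ,   \lambda_r(t_0) = \lambda_{r_0} ,

    t = t_f:   r(t_f) = r_f ,   w(t_f) = w_f ,   v(t_f) = v_f .

The constant λ_r0 scales the multipliers.

In addition to the boundary value system given by
Eqs. (8), we have to satisfy the equations

    0 = (\beta - \beta_{\min})(\beta_{\max} - \beta) - \alpha^2 = g^9 ,

    0 = \frac{c}{m} (\lambda_w^2 + \lambda_v^2)^{1/2} - \lambda_m - \lambda_\alpha (\beta_{\max} + \beta_{\min} - 2\beta) = g^{10} ,      (9)

    0 = \alpha \lambda_\alpha = g^{11} .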
For the discussion of the application of the Newton-Raphson
operator technique to the nonlinear system consisting
of Eqs. (8) and Eqs. (9), we rewrite these equations as
follows:

$$\dot{X} = F(X, t), \qquad G(X, t) = 0, \qquad t \in [t_0, t_f] \tag{10}$$

where

$$X = (x^1, \ldots, x^{11}), \qquad F = (f^1, \ldots, f^8), \qquad G = (g^9, g^{10}, g^{11}),$$

$$f^i = f^i(x^1, \ldots, x^{11}, t), \qquad i = 1, \ldots, 8,$$

$$g^i = g^i(x^1, \ldots, x^{11}, t), \qquad i = 9, 10, 11,$$

and

$$x^1(t) = r(t), \quad x^2(t) = w(t), \quad x^3(t) = v(t), \quad x^4(t) = m(t),$$

$$x^5(t) = \lambda_r(t), \quad x^6(t) = \lambda_w(t), \quad x^7(t) = \lambda_v(t), \quad x^8(t) = \lambda_m(t),$$

$$x^9(t) = \beta(t), \quad x^{10}(t) = \alpha(t), \quad x^{11}(t) = \lambda_\alpha(t).$$

The algorithm now requires the solution of the sequence of
linear equations:

$$\dot{X}_{n+1} = J(X_n, t)(X_{n+1} - X_n) + F(X_n, t) \tag{11a}$$

$$0 = I(X_n, t)(X_{n+1} - X_n) + G(X_n, t) \tag{11b}$$

$$n = 0, 1, \ldots,$$

where $J(X, t)$ is the matrix with elements $J_{ij} = \partial f^i / \partial x^j$,
$i = 1, \ldots, 8$, $j = 1, \ldots, 11$; and $I(X, t)$ is the matrix
with elements $I_{ij} = \partial g^i / \partial x^j$, $i = 9, 10, 11$, $j = 1, \ldots, 11$.

At every iterate $n$, $x_n^9 = x_n^9(x_n^1, \ldots, x_n^8)$ is obtained
from Eq. (11b). This relation is used to eliminate $x_n^9$ from
Eq. (11a). The functions $x_n^1(t), \ldots, x_n^8(t)$ are then computed
from Eq. (11a), after which $x_n^9(t)$, $x_n^{10}(t)$, $x_n^{11}(t)$ are
computed from Eq. (11b). A description of the method of solution for
the linear two-point boundary value system, Eq. (11a), with the given
end conditions is contained in Ref. 2. The iteration proceeds until
$\rho(X_{n+1}, X_n) < \delta$, where

$$\rho(X_{n+1}, X_n) = \sum_{i=1}^{8} \max_{t \in [t_0, t_f]} \left| x_{n+1}^i(t) - x_n^i(t) \right| \tag{12}$$


and $\delta$ is a suitably small positive constant. The corresponding
iterate is accepted as a solution, and a
final check is made by integrating the nonlinear Eqs. (8)
with a complete set of initial conditions taken from the
final iterate, and with $\beta(t)$ computed at every integration
step by

$$\beta = \begin{cases} \beta_{\max} & \text{when } \eta > 0 \\ \beta_{\min} & \text{when } \eta < 0, \end{cases}$$

where

$$\eta = \frac{c}{m}(\lambda_w^2 + \lambda_v^2)^{1/2} - \lambda_m. \tag{13}$$

Equation (13) results from the Weierstrass condition, viz.,
maximizing $H$ with respect to $\beta$.
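The switching rule of Eq. (13) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `switching_function` and `mass_flow` are hypothetical, the multiplier values are made up, and the handling of the case $\eta = 0$ (which the rule does not cover) is an arbitrary choice.

```python
import math


def switching_function(c, m, lam_w, lam_v, lam_m):
    """eta = (c/m) * sqrt(lam_w^2 + lam_v^2) - lam_m."""
    return (c / m) * math.hypot(lam_w, lam_v) - lam_m


def mass_flow(eta, beta_min, beta_max):
    """Bang-bang choice of beta from the sign of eta.

    eta == 0 is a switching instant not covered by the rule; returning
    beta_min there is an arbitrary choice made for this sketch.
    """
    return beta_max if eta > 0 else beta_min


# Normalized constants quoted in the text; multiplier values are made up.
c, beta_min, beta_max = 1.872, 0.0, 0.075
eta_thrust = switching_function(c, 1.0, 0.52, 0.30, 0.0)
eta_coast = switching_function(c, 0.9, 0.10, 0.10, 1.0)
beta_thrust = mass_flow(eta_thrust, beta_min, beta_max)
beta_coast = mass_flow(eta_coast, beta_min, beta_max)
```

With these made-up multipliers the first case has positive $\eta$ and thrusts at the upper bound, while the second has negative $\eta$ and coasts at the lower bound.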


The data for the problem are normalized to obtain:

$r_0 = 1.000$                 $r_f = 1.525$
$w_0 = 0.000$                 $w_f = 0.000$
$v_0 = 1.000$                 $v_f = 0.8098$
$m_0 = 1.000$                 $\beta_{\max} = 0.07500$
$\lambda_{r_0} = 1.000$       $\beta_{\min} = 0.000$
$K = 1.000$                   $c = 1.872$.

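This results in a time unit of 58.18 days. The final time
$t_f$ is chosen to be 3.816 units (222.0 days). The starting
vector $X_0(t)$ is chosen as follows:

$$x_0^1(t) \equiv r_0(t) = r_0 + \frac{r_f - r_0}{t_f}\, t$$

$$x_0^2(t) \equiv w_0(t) \equiv 0$$

$$x_0^3(t) \equiv v_0(t) = \left[\frac{1}{r_0(t)}\right]^{1/2}$$

$$x_0^4(t) \equiv m_0(t) = 1 - \text{[term illegible in source]}$$

$$x_0^5(t) \equiv \lambda_{r0}(t) \equiv 1.000$$

$$x_0^6(t) \equiv \lambda_{w0}(t) = \begin{cases} 0.5200, & t \in [0, \tfrac{1}{2} t_f] \\ -0.5000, & t \in (\tfrac{1}{2} t_f, t_f] \end{cases}$$

$$x_0^7(t) \equiv \lambda_{v0}(t) = \begin{cases} 0.3000, & t \in [0, \tfrac{1}{2} t_f] \\ 0.000, & t \in (\tfrac{1}{2} t_f, t_f] \end{cases}$$

$$x_0^8(t) \equiv \lambda_{m0}(t) \equiv 0$$

$$x_0^9(t) \equiv \beta_0(t) = \frac{\beta_{\max}}{2}\left(1 + \cos\frac{2\pi t}{t_f}\right)$$

$$x_0^{10}(t) \equiv \alpha_0(t) = \frac{\beta_{\max}}{10}\left(1 + \frac{t}{t_f}\right)$$

$$x_0^{11}(t) \equiv \lambda_{\alpha 0}(t) = 10 \sin(\text{[argument illegible in source]}) \tag{14}$$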

To carry out the necessary computations, the time interval
$[t_0, t_f]$ is divided into 200 equal subintervals.
After the switching times ($\eta = 0$) have been located,
within the accuracy of the grid size, the time steps in the
neighborhood of the switching points are further subdivided
into 10 equal intervals, and the iterations continued with
this refined grid. In this manner, it is possible to locate
the switching points, or points of discontinuity of the control
$\beta(t)$, with greater precision.
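The grid refinement just described can be sketched as follows. This is a minimal illustration, not the authors' program: the function name `refine_near_switches` is hypothetical, a switching point is detected as a sign change of the switching function between adjacent grid points, only the bracketing step itself is subdivided (the text speaks more loosely of steps "in the neighborhood"), and the sample switching function is made up.

```python
import math


def refine_near_switches(t_grid, eta_values, pieces=10):
    """Split every grid step that brackets a sign change of eta
    into `pieces` equal parts; leave all other steps unchanged."""
    refined = [t_grid[0]]
    for k in range(len(t_grid) - 1):
        t0, t1 = t_grid[k], t_grid[k + 1]
        if eta_values[k] * eta_values[k + 1] < 0:   # switching point inside
            step = (t1 - t0) / pieces
            refined.extend(t0 + j * step for j in range(1, pieces))
        refined.append(t1)
    return refined


t_f = 3.816                                  # final time quoted in the text
n_steps = 200                                # equal subintervals, as in the text
grid = [t_f * k / n_steps for k in range(n_steps + 1)]
eta = [math.cos(2.0 * t) for t in grid]      # made-up switching function
refined = refine_near_switches(grid, eta)
```

The made-up function changes sign twice on the interval, so the refined grid has 18 more points than the original 201.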
The sequence $\{X_n\}$ converges to an accuracy of 4 significant
figures in 51 total iterations. The total computer
time (IBM 7094) required is approximately 2 minutes. Figures
1 and 2 illustrate the convergence for the control variables
$\theta(t)$ and $\beta(t)$, respectively. The functions $\theta(t)$
and $\beta(t)$ shown result from the final check of the nonlinear
state and Euler-Lagrange equations, Eqs. (8), with the switching
points obtained from Eq. (13).
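The stopping test of Eq. (12), which underlies the convergence count quoted above, can be sketched on sampled iterates. This is an illustration only: the function name `rho`, the list-of-lists layout (one list of grid values per component), the tolerance and the sample numbers are all assumptions introduced here.

```python
def rho(x_next, x_prev):
    """Sum over components of the largest pointwise change between iterates.

    Each argument is a list of components; each component is a list of
    values sampled on the same time grid (hypothetical layout).
    """
    return sum(
        max(abs(a - b) for a, b in zip(comp_next, comp_prev))
        for comp_next, comp_prev in zip(x_next, x_prev)
    )


# Two made-up iterates with 2 components on a 3-point grid.
x_n = [[1.0, 1.2, 1.5], [0.0, 0.10, 0.0]]
x_np1 = [[1.0, 1.25, 1.5], [0.0, 0.08, 0.0]]

tolerance = 0.1          # plays the role of the small positive constant
distance = rho(x_np1, x_n)
converged = distance < tolerance
```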

With the same starting vector $X_0(t)$, Eqs. (14), trajectories
have also been computed with final times
$t_f$ = 195.0, 201.0, 208.0, 215.0 days. In Fig. 3 the final
time $t_f$ is plotted against the ratio of final mass $m_f$ to
initial mass $m_0$.

CONCLUDING  REMARKS 

With the integration routine utilized for these sample
problems, the solutions seem to be limited to an accuracy
of 4 significant figures. We believe that through the use
of higher precision integration schemes, presently under
investigation at Grumman, more accurate results can be
obtained.


ACKNOWLEDGMENT 


The authors wish to thank Dr. Henry J. Kelley for his
crucial suggestions pertaining to this work.

Portions of the theoretical part of this study were generated
in connection with AFOSR Contract No. AF49(638)-1207.


REFERENCES 


1. McGill, R., and Kenneth, P., "A Convergence Theorem on
the Iterative Solution of Nonlinear Two-Point Boundary
Value Systems," presented at the XIVth IAF Congress,
Paris, France, September 1963.

2. McGill, R., and Kenneth, P., "Solution of Variational
Problems by Means of a Generalized Newton-Raphson Operator,"
in Progress Report No. 5 on Studies in the Fields
of Space Flight and Guidance Theory, MSFC, Huntsville,
Alabama, March 1964.

3. Valentine, F.A., "The Problem of Lagrange with Differential
Inequalities as Added Side Conditions," Dissertation,
Department of Mathematics, University of Chicago,
Chicago, Illinois, 1937.

4. Leitmann, G., "Variational Problems with Bounded Control
Variables," Chapter 5 of Optimization Techniques, edited
by G. Leitmann, Academic Press, New York, 1962.

5. Leitmann, G., "An Elementary Derivation of the Optimal
Control Conditions," Proceedings of the XIIth IAF Congress,
Washington, D.C., 1961.

6. Berkovitz, L.D., "Variational Methods in Problems of
Control and Programming," J. Math. Anal. and Appl. 3,
145 (1961).

7. Bliss, G.A., Lectures on the Calculus of Variations,
University of Chicago Press, Chicago, 1946.

8. Breakwell, J.V., "The Optimization of Trajectories," J.
Soc. Ind. Appl. Math. 7, 215 (1959).

9. Kelley, H.J., Kopp, R.E., and Moyer, H.G., "Successive
Approximation Techniques for Trajectory Optimization,"
presented at the IAS Vehicle Systems Optimization Symposium,
Garden City, New York, November 1961.


APPROXIMATING OPTIMAL TRAJECTORIES:
SELECTION OF SIGNIFICANT ESTIMATION VARIABLES
IN A LEAST SQUARES PROBLEM


I. E. PERLIN, J. H. MACKAY,
J. W. WALKER, J. J. GOODE, O. B. FRANCIS, JR.


RICH COMPUTER CENTER
GEORGIA INSTITUTE OF TECHNOLOGY
ATLANTA, GEORGIA


N65 33063

ABSTRACT 


In this report a step-up procedure for the selection of significant
estimation variables in a least squares problem is developed. Application
of this procedure to several examples is made, and a computer program
in ALGOL 58 compiler language for the Burroughs 220 computer is discussed.


APPROXIMATING  OPTIMAL  TRAJECTORIES:  SELECTION  OF  SIGNIFICANT 

ESTIMATION  VARIABLES  IN  A LEAST  SQUARES  PROBLEM 

The Astrodynamics and Guidance Theory Division of the Aero-Astrodynamics
Laboratory of the Marshall Space Flight Center is examining the
role of "large computers" as they may be exploited in the control and
guidance of missile performance. Under Contract No. NAS8-5365 the Georgia
Institute of Technology and its Rich Electronic Computer Center have been
studying such exploitation as it applies to the approximation of guidance
functions with multivariate functional models. Under this contract
attention so far has been focused on methods to reduce the computational
and variable-selection problems in least squares models.

Background 

The state vector, $x(t)$ (describing the flight of a missile through
space), has the derivative $\dot{x}(t)$. These vectors, along with a vector
descriptive of the guidance function, $u(t)$, satisfy equations of motion,
which may be expressed formally as

$$F[\dot{x}(t), x(t), t, u(t)] = 0$$

The missile is intended to satisfy certain mission requirements at some
future time, $t_c$, and we may indicate these requirements in the equations
describing terminal conditions:

$$G[x(t_c), \dot{x}(t_c), t_c] = 0$$

Note that the functions $F$ and $G$ are themselves vectors. The guidance
problem may be expressed generally as that of choosing a "best" guidance
function  u out  of  the  class  of  possible  guidance  functions.  In  particular 
we  may  wish  to  choose  a function  u in  such  a way  as  to  minimize 


In practical situations with real missiles we could not use the
exact optimum guidance function as a function of time because of measurement
errors and so on. The missile strays from the optimum path into a situation
for which the chosen guidance function is no longer best. It then becomes
necessary to calculate a new optimum guidance function based on new initial
conditions. In short it is important to be able to synthesize the optimal
guidance function, $u$, in terms of the state variables at each point in
the phase space.

One  approach  to  this  synthesis  which  has  been  proposed  consists  in 
selecting  a scatter  of  initial  points  (possibly  organized  in  subregions 
of  the  phase  space);  using  a large-scale  computer  to  determine  the  corre- 
sponding values  of  the  optimal  guidance  function;  and  then  using  some 
approximation  technique  to  estimate  the  guidance  function  as  a function 
of  the  state  of  the  missile. 

Various considerations, both practical and theoretical, suggest that
such an approximation be based on the criterion of "least squares." Even
if attention is restricted to this well-known method, however, difficulties
arise. In the first place, fitting a function of several variables
very quickly becomes a huge matrix inversion problem. In an earlier
study done under this contract, entitled "Least Squares Estimation of
Regression Coefficients in a Special Class of Polynomial Models,"
techniques were described which reduced the large inversion problem to a
sequence of low-order inversions, when fitting balanced polynomials
to rectangular grids of data. While these techniques hold promise in
special circumstances, evidently they have a limited usefulness.

A second major difficulty in least squares approximations arises in
deciding which class of functions or which subset of a very large class
of estimation variables will be used to approximate the unknown function.
Evidently, a method which selects a relatively few highly efficient
estimation variables also serves to keep the matrix-inversion problem under
control, since that computation depends directly on the number of
estimation variables used.


It happens that there is a method available by means of which the
incorporation of estimation variables into the approximating functions
can be sequenced in what seems usually to be an efficient manner. We
shall call this formal procedure for activating estimation variables
simply the step-up procedure. The procedure appears first to have been
used by R. J. Wherry (Annals of Math. Stat., 1931). More recent
discussions have appeared by H. E. Anderson and B. Fruchter (Psychometrika,
1960), and E. F. Schultz, Jr. and J. F. Goggans (Bulletin of the
Agricultural Exp. Station, Auburn Univ., 1961). Since examples can be
constructed to show that the step-up procedure is not always optimal,
the difficult problem of assessing its merit arises.

The  primary  concern  of  this  report  is  to  consider  the  merits  of  the 
step-up  procedure,  to  seek  improvement  in  it  and  to  investigate  rules 
to  govern  the  stopping  of  the  selection  procedure. 

While  this  and  related  problems  are  of  considerable  interest  and 
pertinence in the overall trajectory problem, they should not be considered
overriding. Other approaches, where the goodness of approximation
is  more  directly  related  to  the  cost  criterion  or  to  the  equations  of 
motion  and  where  the  mission  fulfillment  is  more  directly  imposed,  show 
at  least  equal  promise  and  are  being  considered  for  subsequent  study. 

Objectives 

1. To conduct empirical investigation of the efficacy of using the
step-up procedure in the selection of a fixed number of estimation
variables out of a larger number in obtaining functional approximations
by the method of least squares (LS).

2.  To  seek  modifications  of  the  procedure  for  the  purpose  of 
enhancing  its  efficiency. 

3.  To  develop  reasonable  rules  which  will  control  the  process  of 
stopping  the  estimation  variables  selection  procedure  and  to  study 
empirically  the  sensitivity  of  the  efficiency  of  the  estimation  to 
variations  in  these  rules. 

4. To explore empirically the general applicability of low-degree
polynomial approximation (in the sense of least squares) to representative
functions of several variables.


5.  To  develop  an  efficient,  flexible  and  unified  computer  program 
which,  in  carrying  out  a least  squares  approximation,  at  least  has  the 
option  of  utilizing  such  selection  procedures  and  stopping  rules  as 
have  been  developed. 

Plan  of  Research 

To accomplish the aims of this part of the study, research was
organized in four phases:

A. A review of the geometry, linear algebra and statistics involved
in the method of least squares and the step-up procedure. This phase
extended to include discussions of modifications to the step-up
procedure and various criteria for stopping the selection process. Also
included were algorithms for computer programs.

B. Development of the structure of the empirical investigations.
In this phase decisions were reached on types of functions to be estimated,
data patterns, size of data base, specific form of the estimation variables
(as functions of independent variables), and how data would be obtained and
reduced to the regression format, with particular regard to the important
case of polynomial approximation.

C. Development of computer programs. In this phase algorithms
developed in preceding phases were converted to programs, with attention to
computational efficiency and cost.

D. A battery of examples with interpretations and, if possible,
conclusions. In this phase a few preliminary examples were designed
to test the efficiency of using the step-up procedure. Later, more
sophisticated examples were used to develop the other objectives cited
above.

Summary 

A.  Mathematical  review  (see  the  supporting  study  titled:  "Selection 

of  Significant  Estimation  Variables  in  a Least  Squares  Problem: 
Mathematical  Review.") 
The well-known method of least squares (LS) is invoked to estimate
a presumed functional relationship between a dependent variable $Y$ and
a set of independent variables $X_1, \ldots, X_\pi$ on the basis of a set of
observed points. According to the method a class of functions of the form

$$a_0 + a_1 Z_1(X_1, \ldots, X_\pi) + \ldots + a_p Z_p(X_1, \ldots, X_\pi),$$

is considered for all real sets of coefficients. The $Z$'s are specified
estimation variables depending on the independent $X$'s. For any function
of the above class, corresponding to an observed vector of $X$'s, one
could compute values $z_{\mu 1}, \ldots, z_{\mu p}$ of the estimation variables
and a value $\hat{y}_\mu = a_0 + a_1 z_{\mu 1} + \ldots + a_p z_{\mu p}$, which
could be compared with the corresponding observed value $y_\mu$ of the
dependent variable $Y$. From this specified class of functions the method
of LS selects one for which the sum of squares of the deviations of the
so-called predicted values $\hat{y}_\mu$ from the observed values $y_\mu$ is
minimized. Such a function is called a
best estimate or best-fitting approximation (in the class) in the sense
of LS.

The choice of the functions to be used as the estimation variables,
$Z_1, \ldots, Z_p$, is open, giving the method great flexibility, but also making
it vulnerably dependent on the choice. In the next section of this
summary some discussion is devoted to the choice of $Z$'s and the reduction
of data to the form of observation vectors $(y_\mu, z_{\mu 1}, \ldots, z_{\mu p})$
on the variables $(Y, Z_1, \ldots, Z_p)$. This form is now assumed.

The least squares approach admits of an accessible geometrical
interpretation. Supposing there are $N$ observation vectors, for each
estimation variable $Z_i$ consider the $N$ observed values (adjusted to the
mean). These values constitute the $i$-th estimation vector $z_i$. Similarly,
consider the mean-adjusted dependent-variable vector $y$. The LS problem
translates to finding that vector in the space spanned by the
estimation vectors which lies closest to the $y$ vector. Or it may be
interpreted as finding the projection of the $y$ vector onto the estimation
space.


The cosine of the angle between the $y$ vector and its projection
in the estimation space is called the multiple correlation coefficient,
$R$. It is a measure of the efficiency of the estimate, attaining a
maximum of unity when the $y$ vector coincides with the projection estimate.

The difference between the $y$ vector and its projection onto the
estimation space is called the error vector. A Pythagorean property
holds, expressing the square of the length of the $y$ vector as the sum
of squares of the lengths of the estimate and the error. The estimate
itself can be resolved into orthogonal components, and the same is
true of the error vector.
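The geometric picture above (projection, multiple correlation coefficient, Pythagorean property) can be checked numerically on a small case. The sketch below is only an illustration: the helper names `dot`, `mean_adjust` and `project`, the use of Gram-Schmidt orthogonalization, and the five-point data set are assumptions introduced here, not taken from the supporting study.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mean_adjust(v):
    mean = sum(v) / len(v)
    return [a - mean for a in v]


def project(y, vectors, tol=1e-12):
    """Projection of y onto the span of `vectors` via Gram-Schmidt."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            coeff = dot(w, q)
            w = [a - coeff * b for a, b in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:                       # skip linearly dependent vectors
            basis.append([a / norm for a in w])
    estimate = [0.0] * len(y)
    for q in basis:
        coeff = dot(y, q)
        estimate = [e + coeff * b for e, b in zip(estimate, q)]
    return estimate


x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = mean_adjust([xi ** 2 for xi in x])       # dependent variable Y = X^2
z1 = mean_adjust(x)                          # one estimation variable Z1 = X

estimate = project(y, [z1])
error = [a - b for a, b in zip(y, estimate)]
R = math.sqrt(dot(estimate, estimate) / dot(y, y))   # cosine of the angle
```

Here the squared lengths are 174 for $y$, 160 for the estimate and 14 for the error, so the Pythagorean property holds and $R = 40/\sqrt{1740}$.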

If only $k$ out of the $p$ available estimation vectors are to be
used to estimate $y$ (corresponding to selecting $k$ out of the $p$ possible
estimation variables), a difficult problem of deciding which $k$ to select
arises, since trying all combinations is ordinarily computationally
infeasible.

The step-up procedure is a practical, though not always perfectly
optimal, way to select $k$ estimation vectors. It evolves naturally from
the geometric model described above. In this procedure the first
estimation vector is chosen by finding the one on which the $y$ vector
has the longest projection (by the Pythagorean property this leaves the
shortest error vector). In the next step, for each of the remaining
vectors it is easy to determine the length of the projection of $y$ on
the component of that vector orthogonal to the first vector chosen. The
square of this length, added to the square of the first projection,
gives the square of the projection of $y$ on the estimation space of
these two vectors. Selected is the vector having the longest such
component. The procedure is then repeated.
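A minimal sketch of the step-up procedure as described above follows. It is not the ALGOL 58 program discussed later in this report: the function name `step_up`, the dictionary of labeled vectors, the orthogonalization of the remaining vectors against each selected one, and the sample data are assumptions made for illustration. The vectors are taken as already mean-adjusted.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def step_up(y, candidates, k, tol=1e-12):
    """Greedy selection of up to k estimation vectors.

    `candidates` maps a label to a (mean-adjusted) vector. Returns the
    labels in order of selection and the value of R after each step.
    """
    remaining = {name: list(v) for name, v in candidates.items()}
    selected, history = [], []
    explained, y_sq = 0.0, dot(y, y)
    for _ in range(k):
        best_name, best_gain = None, 0.0
        for name, w in remaining.items():
            norm_sq = dot(w, w)
            if norm_sq <= tol:
                continue                     # nothing left after orthogonalizing
            gain = dot(y, w) ** 2 / norm_sq  # squared projection of y on w
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:
            break
        q = remaining.pop(best_name)
        q_sq = dot(q, q)
        selected.append(best_name)
        explained += best_gain
        history.append(math.sqrt(explained / y_sq))
        # Keep only the part of each remaining vector orthogonal to q.
        for name in list(remaining):
            coeff = dot(remaining[name], q) / q_sq
            remaining[name] = [a - coeff * b for a, b in zip(remaining[name], q)]
    return selected, history


y = [3.0, 1.0, 0.0]
candidates = {"z1": [1.0, 0.0, 0.0], "z2": [0.0, 1.0, 0.0], "z3": [0.0, 0.0, 1.0]}
selected, history = step_up(y, candidates, k=2)
```

In this made-up case the procedure takes `z1` first and `z2` second, at which point $y$ is reproduced exactly.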

Since  the  y vector  may  lie  in  the  plane  of  two  vectors  but  possibly 
closer  to  a third  vector  (not  in  the  plane),  the  step-up  procedure  is 
not  always  optimal,  for  it  would  activate  the  third  vector  first,  then 
one  of  the  others,  but  the  combination  would  not  be  as  efficient  as 
the  first  and  second. 
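The situation just described can be reproduced with three made-up vectors. The sketch below is an illustration only: the vectors, the helper name `multiple_correlation` and the exhaustive comparison over pairs are assumptions introduced here, and the vectors are treated as already mean-adjusted.

```python
import itertools
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def multiple_correlation(y, vectors, tol=1e-12):
    """R for y against span(vectors), via Gram-Schmidt."""
    basis, explained = [], 0.0
    for v in vectors:
        w = list(v)
        for q in basis:
            coeff = dot(w, q)
            w = [a - coeff * b for a, b in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            q = [a / norm for a in w]
            basis.append(q)
            explained += dot(y, q) ** 2
    return math.sqrt(explained / dot(y, y))


y = [1.0, 1.0, 0.0]                        # lies in the plane of z1 and z2
vectors = {"z1": [1.0, 0.0, 0.0], "z2": [0.0, 1.0, 0.0], "z3": [1.0, 1.0, 0.5]}

single = {name: multiple_correlation(y, [v]) for name, v in vectors.items()}
first_pick = max(single, key=single.get)   # the vector step-up activates first

pairs = {
    pair: multiple_correlation(y, [vectors[n] for n in pair])
    for pair in itertools.combinations(sorted(vectors), 2)
}
best_pair = max(pairs, key=pairs.get)
best_pair_with_first = max((p for p in pairs if first_pick in p), key=pairs.get)
```

The step-up procedure starts with `z3`, and the best pair containing `z3` reaches only $R = \sqrt{0.9}$, whereas the pair (`z1`, `z2`) gives $R = 1$.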

A modification of the procedure has been incorporated to allow for
the elimination of a vector from the active estimation set. It works in
the following way. The error vector for the $k$ selected variables is
compared with the error vector when one vector is deleted from the
active estimation set. The difference measures the net reduction of
error due to the one vector deleted. Computationally it is easy to
compare the lengths of these reductions. One may wish to eliminate a
variable which contributes little net reduction. A measure of the net
reduction due to each estimation vector is provided by the cosine of
the dihedral angle formed by the plane containing the $y$ vector and
its projection in the reduced estimation space, on the one hand, and
the plane containing the two projections, on the other hand. This is
called the partial or net correlation coefficient between the dependent
variable $y$ and the estimation variable in question.
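The deletion test can be sketched numerically. The code below is an illustration under stated assumptions: the function name `deletion_report` and the sample vectors are made up, and the squared partial correlation is computed from the standard identity $(R^2_{\text{full}} - R^2_{\text{reduced}})/(1 - R^2_{\text{reduced}})$ rather than from the dihedral-angle construction of the text.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def multiple_correlation(y, vectors, tol=1e-12):
    """R for y against span(vectors), via Gram-Schmidt."""
    basis, explained = [], 0.0
    for v in vectors:
        w = list(v)
        for q in basis:
            coeff = dot(w, q)
            w = [a - coeff * b for a, b in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            q = [a / norm for a in w]
            basis.append(q)
            explained += dot(y, q) ** 2
    return math.sqrt(explained / dot(y, y))


def deletion_report(y, active):
    """For each active vector: (loss in R^2 if deleted, squared partial correlation)."""
    r2_full = multiple_correlation(y, list(active.values())) ** 2
    report = {}
    for name in active:
        others = [v for other, v in active.items() if other != name]
        r2_reduced = multiple_correlation(y, others) ** 2 if others else 0.0
        net = r2_full - r2_reduced
        partial_sq = net / (1.0 - r2_reduced) if r2_reduced < 1.0 else 0.0
        report[name] = (net, partial_sq)
    return report


y = [1.0, 1.0, 0.0]
active = {"z1": [1.0, 0.0, 0.0], "z3": [1.0, 1.0, 0.5]}
report = deletion_report(y, active)
weakest = min(report, key=lambda name: report[name][0])   # candidate for deletion
```

In this made-up case `z1` contributes far less net reduction than `z3`, so it would be the candidate for elimination.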

It  appears  evident  that  the  simple  rule  of  selecting  k of  p estimation 
vectors  will  not  always  be  a good  stopping  rule.  From  the  geometrical 
description  several  other  natural  criteria  emerge  as  possible  stopping 
rules  whose  use  may  be  varied  according  to  considerations  of  the  particular 
problem  at  hand.  For  example,  if  the  multiple  correlation  coefficient 
is  "very  high"  the  addition  of  other  variables  may  seem  unnecessary.  Again, 
even  if  R is  not  high,  the  modified  step-up  procedure  may  be  making  no 
appreciable  improvement  in  the  estimate  so  that  further  addition  of 
variables  to  the  active  estimation  set  may  be  deemed  useless.  Also, 
depending  on  the  criteria  for  continuing  to  bring  in  new  variables  and 
to eliminate old ones, some stopping rule should be available to guard
against  cycling. 

The  most  difficult  choices  for  these  decision  rules  are  those 
concerning  whether  to  eliminate  an  active  estimation  vector  and  whether 
adding  one  or  several  more  will  make  any  significant  reduction  in  the 
error  vector.  One  might  adopt  the  rule  of  introducing  two  vectors  and 
eliminating  one,  until  a stopping  rule  stops  the  process.  One  might 
eliminate  the  vector  to  which  corresponds  the  lowest  net  correlation 
coefficient,  provided  that  the  coefficient  reaches  a certain  "low" 
value.  One  might  stop  adding  vectors  if  the  last  r added  make  an 
average  addition  to  R of  less  than  some  fixed  amount.  However,  caution 


should be exercised in the fixing of criteria, since certain combinations
of  these  rules  increase  the  chances  of  cycling. 

Finally, we have considered elimination-stopping rule combinations
based on F statistics. Briefly, an F statistic is a ratio of the average
of certain of the estimation components to the average of the error
components. In a statistical context, if the estimation components have
on the average the same length as the error components, they are
considered insignificant and are attributable to random error. In short
these vectors are not considered of estimative significance. From
such a point of view there is some intuitive appeal in the decision rule:
Do not add if $F < 1$; do not drop if $F \ge 1$. However, the rationale
for using the F statistic rules is tenuous and, such as it is, depends
on hypotheses of a statistical model which are not always appropriate.
A fuller discussion of the statistical model is given in the supporting
study.

While the mathematical and statistical analysis suggested the
foregoing procedures and rules, it has also indicated considerable need for
the empirical tests subsequently made.

The mathematical analysis included a translation of the geometrical
steps described above into algorithms capable of being converted to
computer programs. These well-known algorithms also are developed in detail
in the supporting study with every effort made to retain geometrical
interpretations in the development.

B.  Structure  of  the  Empirical  Investigations 

The data were organized in two main phases. The purpose of
empirical runs in the first phase was primarily to gain insight on the
efficiency of the step-up method for activating a subset of estimation
variables out of a large set of such variables. The principal aim of
the runs in the second phase was to explore the relative merits of various
rules for stopping the step-up procedure of adding variables to the
active estimation set and rules for eliminating such variables. Auxiliary
purposes of empirical runs were to test and correct pertinent computer
programs and to obtain from diversified experience an idea of the general
validity of the LS approach as an approximation technique.

As pointed out in the previous section, the generality of the method
of LS leaves considerable latitude in the selection of test cases. In
organizing test runs representing a variety of problem types some of
the factors on which decisions had to be reached included:

1.  The  type  of  function  to  be  approximated,  including  its  form, 
the  number  of  variables  and  the  selection  of  a representative 
member. 

2. The class of approximating functions, i.e., a selection of
the estimation variables $Z_i = Z_i(X_1, \ldots, X_\pi)$, $i = 1, 2, \ldots, p$,
where $(X_1, \ldots, X_\pi)$ presumably is in the domain of the function to
be approximated.

3.  The  number,  extent  and  distribution  of  data  points. 

Admittedly  decisions  reached  during  the  test  construction  concerning 
these  factors  were  somewhat  arbitrary.  They  were  made,  however,  with 
awareness  of  their  significance. 

Briefly, it was decided to construct data for a few selected functions
of three variables, using a rectangular grid of data and balanced
polynomials as approximating functions. In addition, a few runs were made
using actual data, which were developed in certain statistical regression
analyses. Except for the actual data runs the data grids consisted of
500 to 1000 points generated from evenly spaced values of the three
variables on the margins. Thus the undoubtedly important effects
(on goodness of fit) of varying the distribution of data points or
varying the types of estimating functions were not studied here.
Indeed these factors were held more or less constant in order not to
obscure the comparisons of variable-selection procedures.

These decisions led to fairly general and easy algorithms for
generating data for a given test run and reducing them to the format of
LS input. Thus, for a given function $f(X_1, X_2, X_3) = Y$, a given class of
balanced polynomials of the form

$$Y = \sum a_{\gamma_1 \gamma_2 \gamma_3} X_1^{\gamma_1} X_2^{\gamma_2} X_3^{\gamma_3},$$

and a given rectangular grid of points,

$$\{(x_{1t_1}, x_{2t_2}, x_{3t_3})\},$$

observation vectors $(y_\mu, z_{\mu 1}, z_{\mu 2}, \ldots, z_{\mu p})$ were generated
by the computer. Here $y_\mu$ is the value of $Y$ at some
$(x_{1t_1}, x_{2t_2}, x_{3t_3})$, and the estimation variables $Z_i$ are the
several terms of the balanced polynomial, of the form

$$Z_i = X_1^{\gamma_1} X_2^{\gamma_2} X_3^{\gamma_3},$$

while $z_{\mu i}$ is the value of $Z_i$ when
$(X_1, X_2, X_3) = (x_{1t_1}, x_{2t_2}, x_{3t_3})$. The
observation vectors were then in a form to obtain LS estimates of the
coefficients in the best-fitting balanced polynomial, or more specifically
to manipulate in a way aimed at activating the most significant estimative
terms of the balanced polynomial as described in the foregoing section.
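The data-generation step described above can be sketched as follows. This is an illustration, not the program discussed in this report: the function names `balanced_terms` and `observation_vectors` are hypothetical, a balanced polynomial is assumed to contain every product of powers up to the stated degree in each variable (the constant term excluded from the estimation variables), and the test function and margins are made up.

```python
import itertools
import math


def balanced_terms(degrees):
    """Exponent tuples (g1, g2, ...) with 0 <= gi <= degrees[i],
    leaving out the constant term."""
    ranges = [range(d + 1) for d in degrees]
    return [e for e in itertools.product(*ranges) if any(e)]


def observation_vectors(f, margins, degrees):
    """One (y, [z_1, ..., z_p]) pair for each point of the rectangular grid."""
    terms = balanced_terms(degrees)
    rows = []
    for point in itertools.product(*margins):
        y = f(*point)
        z = [math.prod(x ** g for x, g in zip(point, e)) for e in terms]
        rows.append((y, z))
    return terms, rows


# Made-up margins and test function; degrees 2, 2, 1 give 3*3*2 - 1 = 17 terms.
margins = ([0.0, 1.0, 2.0], [0.0, 1.0], [1.0, 2.0])
terms, rows = observation_vectors(
    lambda x1, x2, x3: x1 * x2 + x3, margins, (2, 2, 1)
)
```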

Runs in the first phase were limited to estimating a polynomial (of
higher order than the approximating ones) and estimating a rational function,
while the approximating balanced polynomial class was restricted to be of
second degree in $X_1$ and $X_2$ and first degree in $X_3$, which restricted the
number $p$ of estimation variables (terms of the polynomial) to 17 or less. The
test procedure for these runs was, for each $k = 1, 2, \ldots, p-1$, to determine
the efficiency (multiple correlation) of each of the $\binom{p}{k}$ subsets of $k$
vectors and compare the optimal set with the set produced by the step-up
procedure. Computer time was a limiting factor in these tests.
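The size of the exhaustive comparison is easy to tabulate. The sketch below simply counts subsets for the value p = 17 quoted above; the variable names are made up for illustration.

```python
import math

p = 17   # number of estimation variables in the first-phase runs

# Number of k-element subsets for each k = 1, ..., p - 1.
subset_counts = {k: math.comb(p, k) for k in range(1, p)}
total_subsets = sum(subset_counts.values())
largest_k = max(subset_counts, key=subset_counts.get)
```

For p = 17 the total is $2^{17} - 2 = 131070$ subsets, with the largest single count, 24310, occurring at k = 8 (and again at k = 9).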

Runs in the second phase included estimating an exponential function
and a few algebraic functions other than rational functions, and they
included two runs using actual statistical data. Some effort was made to
include poorly fitted functions as well as accurately fitted ones. Also,
the form of the approximating balanced polynomial was stepped up to develop
47 estimation variables. Usually, for each example, several runs were
initiated in which the policies of stopping the selection
procedure or of eliminating a variable were varied.

Considered,  but  not  developed  in  this  study,  was  an  experimental 
design  in  which  runs  would  be  made  for  the  various  different  combinations 
of  prescribed  levels  of  the  main  factors  thought  to  influence  efficient 
variable  selection. 

C.  Development  of  Computer  Programs  (see  the  supporting  study 
titled,  "Selection  of  Significant  Estimation  Variables  in  a 
Least  Squares  Problem:  Computer  Programs.") 

Corresponding  to  the  two  phases  of  the  study  mentioned  in  the 
last  section,  two  computer  programs  were  developed.  The  purpose  of  the 
first  program  was  to  compare  in  a few  examples  the  subset  of  k estimation 
vectors  selected  by  the  step-up  procedure  with  the  optimal  subset  of  k. 
This  first  phase  of  programming  was  begun  before  the  Burroughs  5000  was 
operational  on  contractor  facilities  and  was  programmed  in  the  ALGOL  58 
compiler  language  for  the  Burroughs  220  computer.  Because  of  core 
memory  limitations  the  program  restricts  the  total  number  of  estimation 
vectors  to  twenty-five.  It  would  be  a simple  matter  to  translate  the 
program  to  one  for  the  more  advanced  computer.  This  has  not  yet  been 
done,  primarily  because  the  number  of  comparisons  to  be  made  even  with 
the  restriction  to  25  variables  makes  for  an  almost  prohibitive  amount 
of  computation  time. 

The program depends on using (1) rectangular grid data and (2)
a balanced polynomial as the general form of the approximating function.
One part of the program, using as input the specified values of each of
the variables and the degree of the balanced polynomial in each variable,
generates internally the grid of data points and computes for each such
point the value of each term of the balanced polynomial. Thus the
estimation vectors are generated.
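
The generation step just described can be sketched in a few lines of Python. This is only an illustration, not the ALGOL program of the study: the function names are hypothetical, and a "balanced polynomial" is assumed here to contain every product of powers in which the exponent of each variable runs from zero up to the stated degree in that variable, with the constant term dropped because the data are later mean-adjusted.

```python
from itertools import product


def rectangular_grid(levels):
    """All combinations of the specified values of each variable."""
    return list(product(*levels))


def balanced_polynomial_exponents(degrees):
    """Exponent tuples with 0 <= e_i <= degrees[i]; the constant term is omitted."""
    all_exponents = product(*(range(d + 1) for d in degrees))
    return [e for e in all_exponents if any(e)]


def estimation_vectors(levels, degrees):
    """Evaluate every polynomial term at every grid point (one list per term)."""
    grid = rectangular_grid(levels)
    vectors = []
    for exponents in balanced_polynomial_exponents(degrees):
        column = []
        for point in grid:
            value = 1.0
            for x, e in zip(point, exponents):
                value *= x ** e
            column.append(value)
        vectors.append(column)
    return grid, vectors


# Small illustration: two variables, quadratic in the first, linear in the second.
levels = [[0.5, 1.0, 1.5], [1.0, 2.0]]
degrees = [2, 1]
grid, vectors = estimation_vectors(levels, degrees)
print(len(grid), len(vectors))
```

Under the same assumption, three variables at 10, 10 and 5 levels with degrees 3, 3 and 2 give 500 grid points and 47 estimation vectors.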

Also the program allows for a procedure to be inserted to incorporate
the computation of the values of the function which is to be
approximated, at each of the grid points of data. Thus the dependent
variable vector y is generated.


As an intermediate calculation the program mean-adjusts the above
vectors and produces the intercorrelation matrix for all the vectors,
including the dependent variable vector. There will be
$L_1 L_2 \cdots L_\pi = p + 1$ such vectors. These are restricted in
number to 25.
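
A minimal sketch of this intermediate calculation follows. It assumes the vectors are held as plain Python lists and that none of them is constant (a constant vector would have zero length after mean adjustment); the function names are illustrative only.

```python
import math


def mean_adjust(vector):
    """Subtract the mean so that the vector sums to zero."""
    mean = sum(vector) / len(vector)
    return [v - mean for v in vector]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def intercorrelation_matrix(vectors):
    """Cosines of the angles between the mean-adjusted vectors."""
    adjusted = [mean_adjust(v) for v in vectors]
    norms = [math.sqrt(dot(v, v)) for v in adjusted]
    n = len(adjusted)
    return [
        [dot(adjusted[i], adjusted[j]) / (norms[i] * norms[j]) for j in range(n)]
        for i in range(n)
    ]


# Two estimation vectors followed by the dependent-variable vector.
z1 = [1.0, 2.0, 3.0, 4.0]
z2 = [1.0, 4.0, 9.0, 16.0]
y = [2.1, 3.9, 6.2, 7.8]
corr = intercorrelation_matrix([z1, z2, y])
for row in corr:
    print(["%.4f" % r for r in row])
```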

In the next part of the program, for each k = 2, 3, ..., p-1, each
one of the $\binom{p}{k}$ subsets of k estimation vectors is manipulated to compare
the  estimation  efficiency  (multiple  correlation)  of  those  subsets.  For 
each  k the  subset  of  k vectors  which  gives  maximum  efficiency  is  printed 
as  is  also  its  corresponding  multiple  correlation  coefficient. 

In the final part of the program the estimation vectors are selected
in the order prescribed by the step-up procedure. At each stage
an  index  of  the  estimation  vector  introduced  at  that  stage  is  printed 
out,  as  well  as  the  multiple  correlation  coefficient  obtained  with  the 
set  of  vectors  selected  up  to  that  stage. 
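
The selection loop can be sketched as follows, under stated assumptions. The sketch works directly on an intercorrelation matrix whose last row and column belong to the dependent variable, and uses ordinary Gaussian pivoting rather than the abbreviated calculations of the study's program. The name `step_up`, the tolerance `tol` and the example matrix are all hypothetical.

```python
import math


def step_up(corr, max_vars=None, tol=1e-8):
    """Select estimation vectors one at a time, each time adding the vector
    that most reduces the residual sum of squares.

    corr is the (p+1) x (p+1) intercorrelation matrix, dependent variable last.
    Returns a list of (index, multiple correlation R) in order of selection.
    """
    a = [row[:] for row in corr]
    y = len(a) - 1
    remaining = list(range(y))
    limit = len(remaining) if max_vars is None else max_vars
    history = []
    while remaining and len(history) < limit:
        # Gain in R^2 from adding j; vectors that are nearly dependent on the
        # active set (tiny residual length a[j][j]) are skipped.
        gains = {j: a[j][y] ** 2 / a[j][j] for j in remaining if a[j][j] > tol}
        if not gains:
            break
        k = max(gains, key=gains.get)
        pivot_row = a[k][:]
        for i in range(len(a)):
            if i == k:
                continue
            factor = a[i][k] / pivot_row[k]
            for j in range(len(a)):
                a[i][j] -= factor * pivot_row[j]
        remaining.remove(k)
        r_squared = 1.0 - a[y][y]  # a[y][y] is the residual fraction 1 - R^2
        history.append((k, math.sqrt(max(r_squared, 0.0))))
    return history


# Hypothetical intercorrelations of three estimation vectors and y (last).
corr = [
    [1.0, 0.5, 0.0, 0.6],
    [0.5, 1.0, 0.0, 0.7],
    [0.0, 0.0, 1.0, 0.3],
    [0.6, 0.7, 0.3, 1.0],
]
history = step_up(corr)
for index, r in history:
    print(index, round(r, 4))
```

In this example the vector with the second-largest simple correlation (index 0) enters last, because most of what it explains is already carried by index 1.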

In this program checks were instituted to restrain the incorporation
of vectors which were practically dependent on vectors already included
in the active estimation set. Also, considerable effort was made to
abbreviate the matrix-inversion type calculations in order to produce
only the multiple correlation, since the number of such calculations,
$2^p - p - 2$, rapidly gets large.
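
The count $2^p - p - 2$ is the number of subsets of sizes k = 2, ..., p-1 drawn from p vectors. The short check below confirms the closed form and shows how quickly it grows; the values of p are chosen to echo the sizes mentioned in the text, and the function name is illustrative.

```python
from math import comb


def subsets_examined(p):
    """Number of subsets of size k = 2, ..., p-1 drawn from p estimation vectors."""
    return sum(comb(p, k) for k in range(2, p))


for p in (11, 17, 25):
    count = subsets_examined(p)
    assert count == 2 ** p - p - 2
    print(p, count)
```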

The purpose of the second program, which rests to a considerable extent
on the assumption that the step-up procedure is reasonably efficient, was
to make available a fairly flexible program for estimation by the method
of LS. It includes options for activating subsets of the estimation
variables according to the step-up procedure and other modified
procedures, and options which can be exercised to stop the selection.
The program was written for the Burroughs 5000 in the ALGOL 60 compiler
language.

As it now stands the program has several options for obtaining the
basic  matrix  of  the  dot  products  of  the  adjusted  vectors  (which  matrix 
reduces  to  the  intercorrelations  matrix  when  the  rows  and  columns  are 
appropriately  standardized). 


(1)  One  of  these  options  is  the  same  as  in  the  previous  program, 
except  that  the  admissible  order  of  the  matrix  has  now  been 
increased  to  more  than  100.  This  option  allows  for  the  rapid 
generation  of  data  for  experimental  studies. 

(2)  Either  the  matrix  of  dot  products  or  the  intercorrelation 
matrix  may  be  read  in  directly.  This  allows  further  study, 
especially  of  subset  selection  procedures,  of  previously 
studied  regression  problems,  least  squares  fittings,  and  so 
forth. 

(3)  Observation  vectors  may  be  directly  read  in.  This  will  be 
the  way  data  will  arise  in  most  realistic  problems,  although 
values  of  the  estimation  variables  may  require  preliminary 
transformation  (e.g.,  if  the  estimation  variables  are  terms 
in  a balanced  polynomial). 

In  this  program,  once  the  basic  matrix  has  been  obtained,  it  is 
retained in memory and can be used over and over, to facilitate
comparisons when various procedures for selection, elimination, and stopping
are  employed. 

In  case  the  intercorrelations  matrix  was  not  introduced  directly 
the  program  gives  an  option  for  computing  and  printing  it  and  using  it 
in  the  remainder  of  the  program. 

In the main part of the program estimation vectors are introduced
in the priority order dictated by the step-up procedure. In addition,
however,  the  procedure  carries  options  which  allow  for  various  rules 
to  be  set  to  make  possible  the  elimination  of  an  estimation  vector  and 
the  stopping  of  the  selection  process. 

At  present  there  are  two  criteria  either  one  of  which  may  be  used 
to  eliminate  an  estimation  variable.  One  option  automatically  eliminates 
an  estimation  variable  after  two  have  been  included.  Of  course  the  one 
deleted  is  the  one  of  lowest  net  correlation  with  the  dependent  variable 
(see  Section  A preceding).  In  the  other  option  the  pertinent  F statistic 
for  the  variable  with  smallest  net  correlation  is  computed  (see  Section  A) 


and  is  tested  against  a preassigned  threshold  value.  If  it  is  below  this 
value,  the  variable  is  deleted.  It  is  possible  to  prevent  any  such 
eliminations  by  setting  the  threshold  equal  to  zero. 
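
Section A is not reproduced here, so the following is only a sketch of the second option. It assumes that the "pertinent F statistic" is the usual partial F ratio, with 1 and N - k - 1 degrees of freedom, for a single variable among k active variables fitted to mean-adjusted data. The function names and the numbers in the example are hypothetical.

```python
def partial_f(r2_with, r2_without, n_obs, k_active):
    """Partial F ratio for one variable among k_active active variables.

    r2_with    -- squared multiple correlation with the variable active
    r2_without -- squared multiple correlation after removing it
    """
    dof = n_obs - k_active - 1  # one further d.f. is lost to mean adjustment
    return (r2_with - r2_without) / ((1.0 - r2_with) / dof)


def should_eliminate(r2_with, r2_without, n_obs, k_active, threshold):
    """Delete the variable when its F ratio falls below the threshold.

    Since F is never negative, a threshold of zero prevents any elimination.
    """
    return partial_f(r2_with, r2_without, n_obs, k_active) < threshold


f_value = partial_f(0.90, 0.899, n_obs=500, k_active=5)
print(round(f_value, 2))
print(should_eliminate(0.90, 0.899, 500, 5, threshold=10.0))
print(should_eliminate(0.90, 0.899, 500, 5, threshold=0.0))
```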

Currently there are four criteria which can be used to stop
the process of adding estimation variables. The program effectively
permits bypassing any or all of these criteria. They are:

(1)  Stop if the F ratio for the next single variable to be
introduced does not exceed that threshold value corresponding to a
preassigned  significance  level.  The  procedure  stops  after 
that  estimation  variable  has  been  added.  This  can  be  bypassed 
by  setting  the  threshold  at  zero. 

(2)  Stop  if  the  current  value  of  the  multiple  correlation  coefficient 
is  sufficiently  large.  This  can  be  bypassed  by  setting  the 
multiple  correlation  threshold  at  unity. 

(3)  Stop  if  the  number  of  variables  chosen  reaches  a preassigned 
number.  This  can  be  bypassed  by  setting  that  number  equal 
to  the  total  number  available. 

(4)  Stop when the number of computational iterations for adding
or  eliminating  a vector  has  exceeded  a preassigned  number. 
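
The four criteria and their bypass settings can be sketched as a single predicate. This is an illustration only: the function and parameter names are hypothetical, and the bypasses are written as explicit tests (threshold of zero, correlation threshold of unity, limits left unset) rather than relying on the comparisons never firing.

```python
def stop_reason(f_last, r, n_active, n_sweeps, *,
                f_threshold=0.0, r_threshold=1.0,
                max_vars=None, max_sweeps=None):
    """Return the name of the first stopping rule that fires, or None."""
    # (1) F ratio of the variable just introduced; bypassed when threshold is 0.
    if f_threshold > 0.0 and f_last <= f_threshold:
        return "F ratio not significant"
    # (2) Multiple correlation large enough; bypassed when threshold is 1.
    if r_threshold < 1.0 and r >= r_threshold:
        return "multiple correlation large enough"
    # (3) Preassigned number of variables reached.
    if max_vars is not None and n_active >= max_vars:
        return "variable count reached"
    # (4) Too many sweeps (additions plus eliminations).
    if max_sweeps is not None and n_sweeps > max_sweeps:
        return "sweep limit exceeded"
    return None


print(stop_reason(0.5, 0.90, 3, 3))
print(stop_reason(0.5, 0.90, 3, 3, f_threshold=1.0))
print(stop_reason(5.0, 0.995, 3, 3, f_threshold=1.0, r_threshold=0.99))
print(stop_reason(5.0, 0.90, 3, 12, max_vars=10, max_sweeps=10))
```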

It  is  noteworthy  that  the  computational  procedures  for  eliminating 
and  for  adding  a vector  are  the  same,  once  the  vector  has  been  earmarked. 

It should also be mentioned that the same precautions as in the earlier
program were taken to prevent the introduction of almost linearly dependent
vectors.

In this program of course the output includes the LS regression
coefficients of the selected estimation vectors, as well as indices of the
vectors selected, and the multiple correlation coefficients.

D.  Test Runs on the Computer (see the supporting study titled,
"Selection of Significant Estimation Variables in a Least
Squares Problem: Empirical Computer Studies.")

As indicated in previous sections, these tests were broken roughly
into two phases. In a very limited way the preliminary set of tests was
conducted to gain a measure of confidence in the step-up procedure as
a means for selecting an efficient subset of estimation variables in
a least squares model. In the tests made, a balanced polynomial of
relatively low order was selected, the terms of which provided the full
set of estimation variables. Estimation vectors, as well as a
dependent-variable vector, were generated from rectangular design data.
Dependent-variable data were computed as values of the function which was
to be approximated. As described previously, subsets of estimation vectors
selected by the step-up procedure were compared with the optimal set.
The primary difficulty in the test runs arose from the fact that the
determination of the actual optimal set of k vectors required comparisons
of $\binom{p}{k}$ sets of vectors, where p was the total number of
estimation variables available. Computational feasibility dictates that
p be severely restricted.

Nevertheless, several preliminary runs were made where p was kept
to about 11, and in all cases less than 18. Several functions were
approximated. These in general represented the class of rational functions.
For one of the functions, which had a pole in the region of data points,
only a poor approximation was obtained. Otherwise, even with low-degree
polynomials, the multiple correlation coefficient was rather high.

In most of these tests the step-up procedure selected, at each
stage, the optimal set of variables. There was one example, however,
where the procedure did not select the optimal set of two vectors,
although the correct selection of a larger number of variables was achieved.
It is also noted that, when R became stable or nearly so, additional
variables introduced by the step-up procedure were not always optimal.
It is possible that this could have been the result of round-off error.

In general these experimental results indicated the step-up
procedure is probably quite efficient, at least when a fair scatter of
points is available. It was noted that, even when the method failed,
the value of R was near optimal. The actual occurrence of failures,
even at early stages, suggested that some means for eliminating
variables would be desirable. Such techniques were introduced and used in
the second phase of testing.


For  the  second  set  of  test  runs  the  Burroughs  5000  program  was 
used.  As  mentioned  earlier,  this  program  allows  for  a larger  number  of 
estimation  vectors  to  be  handled,  incorporates  options  of  data  input, 
variable  elimination  and  program  stops,  but  does  not  make  the  comparisons 
to  determine  a purely  optimal  subset  of  estimation  variables.  In  most 
of  the  examples  studied  in  this  phase  several  runs  were  made  for  each 
example  to  throw  light  on  the  effects  of  changing  the  pattern  of 
variable  elimination  and  stopping  rules.  Attention  was  focused  on 
varying  the  elimination  criterion,  the  effects  of  varying  other  rules 
being  discernible  from  the  print-out,  with  the  principal  basis  for 
elimination being an F statistic (see Section A of Summary). To observe
the effect of certain stopping rules (which can be set in the program
options) the print-out includes, for each "sweep" (where a variable is
eliminated or added to the estimation set), the number of sweeps up to that
stage, the number of estimation variables being used, an index of the last
one eliminated or added, the $F_1$ value of the F statistic for a variable
brought in or the $F_0$ value of the F statistic corresponding to a variable
being eliminated (if it was below the criterion level), and the square $R^2$
of the multiple correlation coefficient, as well as the reduced $R^2$, which
diminishes if and only if the last variable introduced gave an $F_1$ value
less than unity.
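
The "if and only if" property stated above can be checked numerically. The sketch below assumes that the reduced value is the familiar degrees-of-freedom-adjusted $R^2$, namely $1 - (1 - R^2)(N - 1)/(N - k - 1)$ for k active variables and N mean-adjusted observations; that identification is an assumption, since Section A is not reproduced here. All names and numbers are illustrative.

```python
def reduced_r2(r2, n_obs, k_active):
    """Degrees-of-freedom-adjusted squared multiple correlation (assumed form)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - k_active - 1)


def f_to_enter(r2_new, r2_old, n_obs, k_new):
    """F ratio of the variable whose entry raised R^2 from r2_old to r2_new."""
    return (r2_new - r2_old) / ((1.0 - r2_new) / (n_obs - k_new - 1))


n_obs = 20
r2_old, k_old = 0.80, 3

# A fourth variable that adds little (F < 1) lowers the reduced R^2 ...
weak = (f_to_enter(0.805, r2_old, n_obs, 4), reduced_r2(0.805, n_obs, 4))
# ... while one that adds more (F > 1) raises it.
strong = (f_to_enter(0.83, r2_old, n_obs, 4), reduced_r2(0.83, n_obs, 4))
baseline = reduced_r2(r2_old, n_obs, k_old)

print(round(baseline, 4))
print(round(weak[0], 3), round(weak[1], 4))
print(round(strong[0], 3), round(strong[1], 4))
```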

The examples included: approximating three non-polynomial functions,
with the available variables being the 48 terms of a balanced polynomial
cubic in $X_1$ and $X_2$ and quadratic in $X_3$, and the 500 data points
generated from $X_1$ = 0.25(0.25)2.50, $X_2$ = 0.25(0.25)2.50 and
$X_3$ = 0.25(0.25)1.25;
approximating a dependent variable from actual data with available
variables constituting a balanced polynomial in four variables, where
the data are (as would usually be the case in practice) not in rectangular
design; and approximating a dependent variable from actual data where
the intercorrelation matrix of available estimation vectors was given,
the presumption being that these could be non-polynomial terms.

In the first group of examples the functions chosen to be
approximated were

\[ F_1 = (x_1^4 + x_2^3 + x_3^2)^{[\ldots]}, \]
\[ F_2 = \sqrt{x_1^2 + x_2^2 + x_3^2}, \]
\[ F_3 = [\ldots] \quad \text{(an expression involving } x_1 + x_2\text{)}. \]

As in all examples the data were mean-adjusted. Two of the functions, one
of them especially, were very closely approximated (in the range of data) by
the full set of 47 estimation vectors in the sense that $R^2$ was near unity,
while $R^2$ for the remaining function was near 0.9. For each example runs
were made with $F_0$ set over a range of values from high to low. In the case
where $F_0$ was set very low the tendency was to eliminate few or no variables
and thus to be very close to the simple step-up procedure.

The test runs for these examples show that different subsets of
estimation variables will be selected when the elimination (and stopping)
rules are varied. They provide concrete examples wherein the step-up
procedure is bettered by a procedure modified to include an elimination
criterion; where the opposite happens; where an $F_1$ stopping criterion of
1.00 (on the last variable brought in) could stop the procedure which if
continued would later introduce variables significant at this same level.
These test runs suggest, but not markedly or universally, that the
elimination criterion is effective in obtaining a higher R for the same number
of estimation variables; that a high criterion value is more effective for
variables selected early but not for those selected later; that the $F_1$ test
may stop the procedure too soon unless modified; that different problems
seem to need somewhat different rules; that while the set of variables
selected may vary considerably, $R^2$ has a tendency to be fairly stable for
different procedures.

The examples with actual data provided experience with data more of the
type expected in a realistic problem. In addition the first provided a
good example in which an F stopping rule based on a single variable (last
introduced) would have stopped the procedure too soon. The last example
illustrates another point, viz. that out of 14 variables the last nine
variables tested together are not significant at the 50% level while the 6th
one tested alone is significant at this level.

It should be noted that in all the examples, in terms of the multiple
correlation coefficient, a few estimation variables usually accounted
for most of the value of $R^2$.

It is recommended that further insight be obtained by examining the
summary data for the various test runs, given in the supporting study
referred to above.

Conclusions  and  Recommendations 

The step-up procedure, which first activates the one estimation
variable best in the sense of least squares, activates next the one which
contributes the most to a further reduction in the sum of squares,
and so forth, is supported as an efficient and computationally feasible
procedure for selecting priority-rated estimation variables in a least
squares approximation problem.

The nonoptimality of the procedure is manifest in practice.
However, the evidence is strong that even in such a case the results are
near-optimal, as measured by the multiple correlation coefficient, R. The
empirical  evidence  indicates  more  reliability  of  the  step-up  procedure 
in  the  activation  of  the  earlier  and  presumably  more  significant  variables 
than  in  later  variables.  When  a large  number  of  estimation  variables 
is  involved,  the  optimal  value  of  R appears  to  be  nearly  reached  by 
several  subsets  of  estimation  vectors.  Thus,  although  frequently  in 
these  cases  the  set  selected  by  the  step-up  procedure  is  not  optimal, 
it  is  very  nearly  so. 

If  it  is  important  to  restrict  the  number  of  estimation  variables, 
there  appears  to  be  a need  for  a means  of  eliminating  variables  previously 
activated. The procedure of eliminating an active variable whose net
contribution to the reduction in the sum of least squares is least (and
small) is practicable and frequently effective. Examples show, however,
that the elimination modification does not always improve on the simple
step-up procedure. Moreover, it carries the same cost as activating an
estimation variable. No fixed elimination criterion is best for any
wide  variety  of  problems.  The  experiments  indicated  an  overall  tendency 
for  a large  elimination  criterion  to  be  more  effective  when  the  active 
estimation  subset  is  small  and  a small  criterion  to  be  more  effective 
when  the  number  of  active  estimation  variables  has  become  sizable. 

The  use  of  rules  to  stop  the  activation  of  additional  estimation 
variables  must  often  depend  on  such  factors  as  available  computer  time 
and  rate  of  computer  time  utilization.  A comprehensive  set  of  rules, 
which  may  be  used  in  various  combinations,  includes  stopping  when  R 
is  sufficiently  large,  when  the  activation  of  additional  variables  does 
not  contribute  significantly  to  the  estimation,  when  the  number  of 
variables reaches a preassigned number or when the computational
procedure begins to cycle. Examples show that the second of these can
occasionally  stop  the  process  too  soon,  so  that  the  contribution  of  the 
last  several  active  variables,  rather  than  just  the  last  one,  should 
probably  be  tested.  The  speed  with  which  variables  were  eliminated  or 
introduced  in  the  examples  indicates  that  large  blocks  of  variables 
could  be  introduced  before  making  any  decision  on  which  variables  to 
keep  active. 

The  study  shows  that  at  the  current  state  of  computer  science  it  is 
still  infeasible  to  examine  all  combinations  of  subsets  of  estimation 
variables  to  determine  the  optimal  subset,  unless  the  total  number  is 
quite  small,  and  thus  that  the  need  remains  for  such  a procedure  as 
the step-up procedure. The study has also given evidence of the
feasibility of the rapid selection of efficient estimation variables even
from a set of several hundred, using a fairly sophisticated system of
optional variable-elimination and stopping rules.

Finally,  with  reservation,  it  should  be  noted  that  in  all  the  examples 
there was a marked relative efficiency of a small set of active
estimation variables to the entire set of estimation variables available.

In  view  of  the  foregoing  results  the  step-up  procedure  is  recommended 
as an effective means for selecting priority-rated estimation variables
in a least squares analysis. The use of the modified procedure and the
various  stopping  rules  is  also  recommended  with  the  admonition  that  the 
various  settings  ought  insofar  as  possible  to  be  adjusted  to  suit  the 
experience  of  workers  familiar  with  the  problem  area  under  study. 

Specifically,  with  regard  to  the  context  of  estimating  optimal 
trajectories,  i.e.  with  regard  to  the  problem  giving  rise  to  this  study, 
it is recommended that further general analysis of the method
described herein, either theoretical or empirical, not be undertaken, but
that the method and experience gained be applied in a series of
experiments with actual trajectory data as soon as possible, where the
experience of researchers in the field and the knowledge of physics pertinent
to the problem will be utilized to help delimit the class of approximating
functions.

Finally,  using  methods  of  design  of  experiments  and  a limited  class 
of  functions  presumably  pertinent  to  trajectory  problems  and  including 
some  live  data,  it  may  be  feasible  to  study  the  effects  (on  approximation 
efficiency)  of  varying  certain  factors  such  as  data  distribution,  type 
of  approximation,  elimination  criterion,  and  so  on. 


SELECTION  OF  SIGNIFICANT  ESTIMATION  VARIABLES 
IN  A LEAST  SQUARES  PROBLEM:  MATHEMATICAL  REVIEW 


1.  Introduction. The principle of least squares (LS) can be
formulated in the following terms. Presumed to exist is some sort of
functional dependence of one variable, Y, called the dependent variable,
on a vector, $(X_1, \ldots, X_\pi)$, of $\pi$ other variables, called independent
variables. Available is a number (say N) of observations, i.e. values
of Y corresponding to values of the vector $(X_1, \ldots, X_\pi)$. Next is chosen a
class of admissible functions of the form

\[ a_1 Z_1(X_1, \ldots, X_\pi) + \cdots + a_p Z_p(X_1, \ldots, X_\pi), \]

where the $Z_i$ are fixed functions of the X's and the parameters of the
class are $a_1, a_2, \ldots, a_p$. The functions $Z_i$ presumably are chosen
to enhance the likelihood that the unknown functional relationship (between
Y and the X's) will be nearly of the prescribed form. Each function of the
class is linear in the variables $Z_1, \ldots, Z_p$, which we shall call
estimation variables; each function is also linear in the parameters. In any
case the basic idea in the least squares approach is to approximate the
unknown functional relationship with one of the admissible functions. For
any one of the functions in the class, corresponding to each observation,
$(x_{\mu 1}, \ldots, x_{\mu\pi})$, is the value of the function,

\[ \tilde{y}_\mu = a_1 z_{\mu 1} + \cdots + a_p z_{\mu p}, \qquad
   z_{\mu i} = Z_i(x_{\mu 1}, \ldots, x_{\mu\pi}), \]

which is comparable with the value of Y (say $y_\mu$) corresponding to this
same observation, $(x_{\mu 1}, \ldots, x_{\mu\pi})$. The sum of squares,

\[ \sum_{\mu=1}^{N} (y_\mu - \tilde{y}_\mu)^2, \]

is taken as a measure of the estimative value of the function
$\tilde{Y} = a_1 Z_1 + \cdots + a_p Z_p$. According to the principle of least
squares, out of the class of admissible functions is chosen as an estimate
of Y one function which minimizes the sum of squares of deviations. Such an
estimate (we shall see that one does exist) is written as
$\hat{Y} = \sum_{1}^{p} b_i Z_i$; we shall call such a function a best
estimate or best-fitting approximation (in the class) in the sense of
least squares. The sum of squares of deviations,
$\sum_{\mu=1}^{N} (y_\mu - \hat{y}_\mu)^2$, is
called the sum of least squares or the residual sum of squares due to
error. The procedure of obtaining a best estimate in the above sense
is frequently called a regression analysis, or more properly a linear
regression analysis. The $b_i$ are often called regression coefficients.
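
As a small worked illustration of the principle, the regression coefficients can be obtained by solving the normal equations $(z^T z)\,b = z^T y$, which express the condition that the residual is orthogonal to every estimation vector. The sketch below is a plain-Python illustration with hypothetical names and made-up data; it is not the computational scheme developed later in the paper.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(matrix, rhs):
    """Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                factor = m[r][c] / m[c][c]
                m[r] = [x - factor * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(columns, y):
    """Regression coefficients b minimizing the sum of squared deviations."""
    p = len(columns)
    gram = [[dot(columns[i], columns[j]) for j in range(p)] for i in range(p)]
    rhs = [dot(columns[i], y) for i in range(p)]
    return solve(gram, rhs)


# Observations generated exactly by Y = 2 Z_1 + 3 Z_2, so the fit is perfect.
z1 = [1.0, 2.0, 3.0, 4.0]
z2 = [1.0, 0.0, 1.0, 0.0]
y = [2 * a + 3 * b for a, b in zip(z1, z2)]
b = least_squares([z1, z2], y)
residual = sum((yi - (b[0] * a + b[1] * c)) ** 2 for yi, a, c in zip(y, z1, z2))
print([round(x, 6) for x in b], round(residual, 12))
```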

The method of LS was known and used by Gauss over 150 years ago.
He discovered that under certain conditions the method of least squares
in a sense yields an optimal estimate. This is the famous Gauss-Markov
theorem. Briefly, the principal hypothesis for this theorem is that
except for random deviations the observed values of Y are values
corresponding to one of the functions in the admissible class. The random
deviations are assumed to be statistically uncorrelated, with a common
variance and mean zero. Under the additional hypothesis of normality of the
distribution of these deviations an elegant statistical theory of estimation
and hypothesis testing can be constructed. The statistical model is
discussed briefly in Section 5 below.

The  method  of  LS  is  used  widely  in  numerical  analysis  even  when  the 
support  of  the  Gauss-Markov  theorem  cannot  honestly  be  invoked.  In 
many  cases  other  methods  perhaps  are  equally  or  more  justifiable;  but 
often  the  method  of  LS  has  an  intuitive  appeal  in  that  it  seeks  an 
estimate  which  minimizes  one  obvious  measure  of  error. 

It  is  also  possible  to  consider  classes  of  admissible  functions, 
from  which  an  estimate  will  be  chosen  on  the  basis  of  the  LS  principle, 
which  classes  are  nonlinear  in  the  parameters.  In  many  instances  such 
problems  are  resolved  satisfactorily  by  iterative  techniques.  The 
procedure  of  obtaining  estimates  of  the  parameters  in  such  a case  is 
called  a nonlinear  regression  analysis. 


Excellent accounts of the statistical linear regression model are
given in GRAYBILL, SCHEFFÉ, and ZELEN. The method of LS is given space
in most numerical analysis books, and sometimes the nonlinear case is
discussed. E.g., see SCARBOROUGH. Nonlinear regression analysis is
treated from a statistical point of view in WILLIAMS.

In applications of LS it is often the case that the number of
estimation variables, for which values are computable from observations
on independent variables, is very large. Certain recurring and nagging
questions arise, varying somewhat with the circumstances. If only k of
p variables can be used, which k should be chosen? Does the use of
additional variables contribute significantly to increased efficiency of
estimate? The second of these questions is not mathematically
meaningful until the word "significantly" is defined. However, in the
context of a given problem, the question is one that frequently must
be raised, given meaning and acted on.

There is an obvious answer to the first question raised above, viz.,
to determine by computation which of the $\binom{p}{k}$ sets of estimation
variables yields the minimum sum of least squares from the data.
Unfortunately this straightforward procedure is computationally infeasible
unless p is quite small. A more tractable and completely reliable method of
finding the optimal set of k estimation variables remains an open problem.
However, at least as early as 1931, WHERRY proposed a procedure for
selecting a reasonably efficient subset of estimation variables. This
procedure we call (because it has become our habit) simply the step-up
procedure. It consists in selecting first the one estimation variable best
in the sense of LS, next the one which contributes the most to a further
reduction in the sum of LS, and so forth. In this way variables are added
until some rule stops the process. The procedure is computationally
very feasible and fast. However, it is easy to show it is not always
optimal. The step-up procedure has recently been described without
much critical analysis in papers by ANDERSON and FRUCHTER, and SCHULTZ
and GOGGANS.


The aims of the present paper are: to illuminate the method of LS
in linear regression analysis with geometrical arguments, giving clear
interpretation of certain measures of estimation efficiency; thus to lead
into a natural development of the step-up procedure where its weakness
as well as its intuitive appeal are exposed; to examine the geometrical
structure for a procedure for elimination of a variable previously
selected, and thus mitigate the flaws in the step-up procedure; to
explore the statistical model for reasonable decision rules on when to
eliminate and when to keep adding variables; and finally to provide a
translation of the various geometrically conceived procedures to
computable algorithms.


2.  Geometric formulation of the principle of least squares. The
notion of obtaining an estimate, $\hat{Y} = \sum_{1}^{p} b_i Z_i$, out of the
admissible class which minimizes the sum of squares of deviations, is one
admitting of accessible and correct geometrical descriptions. Such a
formulation is helpful in understanding the step-up procedures for selecting
significant estimation variables (to be described in the next section) and
seems to hold the only hope of devising techniques even more defensible than
the step-up procedure. We proceed now toward such a formulation.

Assumed available are the N observation vectors,
$(y_\mu, z_{\mu 1}, \ldots, z_{\mu p})$, $\mu = 1, 2, \ldots, N$, where
$z_{\mu i} = Z_i(x_{\mu 1}, \ldots, x_{\mu\pi})$ as indicated in the
preceding section. Associated with each of the p estimation variables
$Z_i$, $i = 1, 2, \ldots, p$, is the vector, lying in the euclidean N-space
$E^N$, consisting of N values $z_{\mu i}$, $\mu = 1, 2, \ldots, N$, observed
on that variable. We shall call these vectors estimation vectors; we write
them $z_i$ ($i = 1, 2, \ldots, p$), and for matrix manipulations they will be
thought of as column vectors. Hence, using the letter T to indicate matrix
transpose, $z_i^T = (z_{1i}, z_{2i}, \ldots, z_{Ni})$. In this section the
N x p matrix of these estimation vectors will be denoted as z. Similarly,
the symbol y represents the vector of the observed values of the dependent
variable Y. It will be assumed, without any real loss of generality, that
N > p and that the estimation vectors are linearly independent. Thus the
estimation vectors constitute a basis of a p-dimensional vector space
$V_p$, lying in $E^N$.

Consider now the sum of squares criterion. Writing the parameter
vector as a, this criterion is

\[ g(a) = \sum_{\mu=1}^{N} (y_\mu - \tilde{y}_\mu)^2 = d^T d, \]

where $d = y - \tilde{y}$ is the vector of deviations. Note that
$\tilde{y} = \sum_{1}^{p} a_i z_i$ lies in the vector space $V_p$ generated
by the estimation vectors and that $d^T d$ is the square of the (euclidean)
distance between y and $\tilde{y}$. Since the aim was to determine b such
that $g(b) = \min \{ g(a) \mid a \}$, the least squares
problem may be interpreted as finding a vector in the space spanned
by the estimation vectors which lies nearest the dependent-variable
vector y.

Geometrical intuition now supplies the correct solution to the least
squares problem; viz., the vector in $V_p$ lying nearest y is the projection
of y onto $V_p$. Other important points are indicated by the geometry.
Writing $\hat{y}$ as the projection of y onto $V_p$, $e = y - \hat{y}$, and $e^2 = e^T e$, etc.,
Pythagorean relations are indicated. E.g., $y^2 = \hat{y}^2 + e^2$; i.e., the square
of the length of the dependent-variable vector equals the sum of the
squares of the lengths of the best estimate vector and the least
squares residual error vector. This is often stated as, "The total
sum of squares equals the sum of squares due to regression (estimation)
plus the sum of squares due to error." Also, if $\tilde{y} = \sum_i a_i z_i$ is another
vector lying in $V_p$, and if $d = y - \tilde{y}$, then $d^2 = e^2 + (\hat{y} - \tilde{y})^2$. Also, the
e vector will be orthogonal to $V_p$. Finally, the angle between y and
its projection should be less than the angle between y and any other
vector in $V_p$. Thus $\cos\theta(y, \hat{y}) > \cos\theta(y, \tilde{y})$, where $\theta(u, v)$ means the angle
between vectors u and v.

In statistical terminology the cosine of the angle between two such
vectors is called a correlation coefficient. Recall that

$$\cos\theta(u, v) = \frac{u^T v}{\sqrt{u^T u}\,\sqrt{v^T v}} = \frac{\sum_i u_i v_i}{\sqrt{\sum_i u_i^2 \, \sum_i v_i^2}}.$$

In the above instance $\cos\theta(y, \hat{y}) = R$ is called the multiple correlation
coefficient between y and $\hat{y}$. Note that this should be unity if y does
indeed lie in $V_p$, and should reduce progressively to zero when the
estimation space is less and less effective. Thus R provides a rather
useful and suggestive index of the efficiency of the estimation space. The
square of the length of the least squares residual error vector, $e^T e$, is
another closely related measure of the efficiency of estimation.

The situation is represented schematically in the following diagram.

[Diagram not reproduced in this copy.]


The foregoing geometrical discussion can be substantiated with a
detailed algebraic development. Such substantiation is a consequence
of the argument to follow, but the primary purpose of the argument is
to make the geometrical entities explicit, to make essential quantities
computable and to set the stage for the next section.

The estimation space is spanned by sets of orthogonal vectors
of unit length. Let $z_1^*, z_2^*, \ldots, z_p^*$ be one such set. Since every vector
in $V_p$ is a unique linear combination of the estimation vectors,

$$z_1^* = q_{11} z_1 + \cdots + q_{p1} z_p$$
$$\vdots$$
$$z_p^* = q_{1p} z_1 + \cdots + q_{pp} z_p;$$

i.e., $z^* = zQ$, where Q is a non-singular p x p matrix, and, of course,
$z = z^* Q^{-1}$. Also, every vector in $V_p$ has a unique representation either
as a linear combination of $z_1, \ldots, z_p$ or of $z_1^*, \ldots, z_p^*$. If $\tilde{y}$ lies in $V_p$,
then there exists a unique vector a such that $\tilde{y} = \sum_i a_i z_i = za$, and there
exists a unique vector $a^*$ such that $\tilde{y} = z^* a^*$. But $z = z^* Q^{-1}$, so that
$a^* = Q^{-1} a$. Thus there is a one-one correspondence between coefficient
vectors a for the z basis and vectors $a^*$ for the $z^*$ basis. In particular,
if $b^*$ is such that $\hat{y} = z^* b^*$ is the one vector in $V_p$ closest to y, then
$\hat{y} = zb$, where $b = Q b^*$.

With these orthogonal vectors $z^*$ in mind an orthogonal transformation
is now imposed on the points in $E^N$ in such a way that, in
the transformed space $E'^N$, the $z_i^*$ become the unit vectors $u_1, u_2, \ldots, u_p$.
Such a transformation is accomplished with an N x N orthogonal matrix P
whose first p rows are the vectors $z_i^{*T}$. It is easily seen that distances
and angles are preserved under such a transformation, so that the least
squares problem is invariant under the transformation. Note that the
image $V_p'$ of $V_p$ is simply the linear combinations of the unit vectors
$u_1, \ldots, u_p$. Let $y' = Py$, and let $\tilde{y} = z^* a^*$ lie in $V_p$, so that $\tilde{y}' = \sum_{i=1}^{p} a_i^* u_i$.
Then the square of the error vector is

$$d^T d = d'^T d' = \Big(y' - \sum_{i=1}^{p} a_i^* u_i\Big)^T \Big(y' - \sum_{i=1}^{p} a_i^* u_i\Big) = \sum_{i=1}^{p} (y_i' - a_i^*)^2 + \sum_{i=p+1}^{N} y_i'^2.$$

Evidently the projection of $y'$ onto $V_p'$ ought to be the vector whose first
p components are those of $y'$ and whose remaining components are zero.
Thus the $a_i^*$ which produce the combination of $u_i$ (i = 1, 2, ..., p)
constituting the projection of $y'$ on $V_p'$ are $y_i'$. In short, $b_i^* = y_i'$, i = 1, 2, ..., p.
That this is correct algebraically can be seen in the preceding equation,
where it is obvious that these are the values of $a_i^*$ which minimize the
square of the error vector. Write $\hat{y}' = \sum_{i=1}^{p} b_i^* u_i = [y_1', \ldots, y_p', 0, \ldots, 0]^T$.
Note that the residual error vector is $[0, \ldots, 0, y_{p+1}', \ldots, y_N']^T = e'$, so
$e'$ and $\hat{y}'$ are orthogonal. Note also that $\sum_{i=1}^{p} (y_i' - a_i^*)^2 = (\hat{y}' - \tilde{y}')^2$ and
hence, from the foregoing equation, that

$$d'^2 = (\hat{y}' - \tilde{y}')^2 + (y' - \hat{y}')^2 = (\hat{y}' - \tilde{y}')^2 + e'^2.$$


Having seen now that, relative to an orthogonal basis of $V_p$,
$b^* = z^{*T} y$ (which follows from the fact that $b_i^* = y_i'$ and $y_i' = z_i^{*T} y$ for
i = 1, 2, ..., p), it is now desirable to obtain $\hat{y}$ and $e^T e$ in terms of the
original estimation vectors and the dependent variable vector. But
$\hat{y} = z^* b^* = zb$, where $b = Q b^* = Q z^{*T} y = Q Q^T z^T y$. Now

$$(Q Q^T)^{-1} = (Q^{-1})^T Q^{-1} = (Q^{-1})^T z^{*T} z^* Q^{-1} = (z^* Q^{-1})^T (z^* Q^{-1}) = z^T z.$$

Thus, writing $h = z^T z$ and $g = z^T y$, in terms of original data, $b = h^{-1} g$.
Also

$$\hat{y}^2 = \sum_{i=1}^{p} y_i'^2 = b^{*T} b^* = y^T z^* z^{*T} y = y^T z Q Q^T z^T y = (z^T y)^T b = \sum_{i=1}^{p} b_i g_i,$$

so that, by the Pythagorean relation, $e^T e = y^T y - \sum_i b_i g_i$.

Thus computationally the problem is one of solving the system of
equations hb = g. In the succeeding discussion it will be important to
remember the following principle which summarizes much of the preceding
development and unifies the geometry and algebra of the least squares
problem: Given a set of k linearly independent vectors $z_1, \ldots, z_k$ in
a euclidean space and a (k+1)-st vector w, if $h = z^T z$ where $z = (z_1, \ldots, z_k)$
and $v = z^T w$, then the solution x of the equations hx = v is such that
zx is the projection of w onto the space generated by the $z_i$, and the
solution effectively resolves the w vector into its projection zx
and a component, e = w - zx, orthogonal to the projection.
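
The principle just stated can be checked numerically. The following is a minimal sketch in pure Python, assuming small illustrative data; the function names `dot`, `solve` and `project` and the sample vectors are hypothetical and do not come from the report. It forms h and v, solves hx = v, and returns the projection zx together with the orthogonal component e.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def solve(h, g):
    """Solve h x = g by Gauss-Jordan elimination with partial pivoting."""
    n = len(g)
    t = [list(map(float, row)) + [float(gi)] for row, gi in zip(h, g)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(t[r][c]))
        t[c], t[piv] = t[piv], t[c]
        p = t[c][c]
        t[c] = [x / p for x in t[c]]
        for r in range(n):
            if r != c:
                f = t[r][c]
                t[r] = [x - f * w for x, w in zip(t[r], t[c])]
    return [row[n] for row in t]


def project(zs, w):
    """Resolve w into its projection on span(zs) and an orthogonal part.

    zs is a list of k linearly independent vectors.  Returns (x, zx, e)
    where h x = v, zx is the projection and e = w - zx.
    """
    h = [[dot(zi, zj) for zj in zs] for zi in zs]   # h = z^T z
    v = [dot(zi, w) for zi in zs]                   # v = z^T w
    x = solve(h, v)
    zx = [sum(xi * zi[m] for xi, zi in zip(x, zs)) for m in range(len(w))]
    e = [wm - pm for wm, pm in zip(w, zx)]
    return x, zx, e


# Illustrative data (an assumption, not taken from the report).
z1 = [1.0, 1.0, 1.0, 1.0]
z2 = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 5.0]

b, y_hat, e = project([z1, z2], y)
# Multiple correlation coefficient R = cos(angle between y and y_hat).
R = dot(y, y_hat) / (dot(y, y) * dot(y_hat, y_hat)) ** 0.5
```

In this sketch the residual e is orthogonal to both estimation vectors, and the squared lengths satisfy the Pythagorean relation y.y = y_hat.y_hat + e.e up to rounding.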

3.  The Step-up Procedure. In this section emphasis is shifted
to  the  selection  of  a subset  of  (say)  k estimation  vectors  out  of  a total 
number  of  (say)  p.  An  optimal  set  of  k,  by  definition,  will  be  that 
set  of  k corresponding  to  which  the  length  of  the  error  vector  is  least 
(or  equivalently  the  multiple  correlation  coefficient  R is  most).  The 
plausibility  of  the  step-up  procedure,  as  well  as  its  deficiencies,  will 
be  seen  from  the  geometrical  development.  Computational  feasibility 
and  procedures  will  be  evident  from  the  corresponding  algebra. 

For  the  moment  we  suppose  that  k-1  vectors  have  been  chosen  and 
that  our  purpose  is  to  add  another  one  from  the  p-(k-l)  remaining.  We 
shall refer to estimation vectors selected as being in the active estimation
space or as being active.

With regard to a least squares problem involving y and the k-1
active estimation vectors (which of course are a basis for a vector
space of dimensionality k-1) everything in the preceding section is
directly applicable. This succession of problems with 1, 2, ..., k, ..., p
vectors in the active estimation space is sometimes called the succession
of the 1st, 2nd, ..., kth, ..., pth fittings. We shall frequently use a
superscript to indicate the fitting, or dimension of the active estimation
space. This notation does not specify which of the vectors are in the
active estimation space, but we shall tacitly assume they have been
relabeled so that the active estimation vectors are now $z_1, z_2, \ldots, z_{k-1}$.


According to the preceding section

$$\hat{y}^{(k-1)} = \sum_{i=1}^{k-1} b_i^{(k-1)} z_i = z^{(k-1)} b^{(k-1)},$$

where $b^{(k-1)}$ is the solution to the system of equations, $h^{(k-1)} b^{(k-1)} = g^{(k-1)}$,
with $h^{(k-1)} = z^{(k-1)T} z^{(k-1)}$, $g^{(k-1)} = z^{(k-1)T} y$, and $z^{(k-1)} = (z_1, \ldots, z_{k-1})$.
Recall that $\hat{y}^{(k-1)}$ is the projection of y onto $V_{k-1}$ and that the residual
error vector $e^{(k-1)}$ has length whose square is $(y \cdot y) - (b^{(k-1)} \cdot g^{(k-1)})$.


Suppose next that the kth vector to become active has been selected.
Consider the system of equations $h^{(k-1)} x^{(k-1)} = v^{(k-1)}$, where $v^{(k-1)} = z^{(k-1)T} z_k$.
Recall that $\sum_{i=1}^{k-1} x_i^{(k-1)} z_i$ is the projection of $z_k$ onto
$V_{k-1}$, and $z_k' = z_k - \sum_{i=1}^{k-1} x_i^{(k-1)} z_i$ is the component of $z_k$ lying orthogonal
to the space spanned by the $z_1, \ldots, z_{k-1}$. The vectors $z_1' = z_1, z_2', \ldots, z_k', \ldots$
thus defined are a particular determination of Gram-Schmidt orthogonal
vectors. In matrix form the matrix of the first k of these Gram-Schmidt
vectors is

$$z'^{(k)} = z^{(k)} q'^{(k)}, \quad \text{where} \quad q'^{(k)} = \begin{pmatrix} 1 & -x_1^{(1)} & -x_1^{(2)} & \cdots & -x_1^{(k-1)} \\ 0 & 1 & -x_2^{(2)} & \cdots & -x_2^{(k-1)} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & -x_{k-1}^{(k-1)} \\ 0 & 0 & \cdots & 0 & 1 \end{pmatrix}$$

from the equation above.

Normalized Gram-Schmidt vectors are obtained when the columns of
$z'^{(k)}$ are divided by $(z_i' \cdot z_i')^{1/2}$. Thus orthonormal Gram-Schmidt vectors are

$$z^{*(k)} = z^{(k)} Q^{(k)},$$

where $Q^{(k)}$ is upper triangular with the reciprocals of the lengths of
the Gram-Schmidt vectors in the diagonal.

Recapitulating  at  this  point,  we  have  an  orthonormal  basis  for  the 
active  estimation  space  in  terms  of  the  Gram-Schmidt  orthogonal  vectors, 
where the last Gram-Schmidt vector was the component of the last estimation
vector selected, orthogonal to the space of the others.

It is interesting to note that the lengths of the Gram-Schmidt
vectors $z_i'$ are readily available from the original estimation vectors.
In fact, using the basis $z_1^*, \ldots, z_k^*$ derived from the Gram-Schmidt vectors
as the orthonormal basis of the previous section, it follows from the
results of that section that $z^{*(k)} = z^{(k)} Q^{(k)}$, where $Q^{(k)}$ is triangular
with $q_{kk}^{(k)} = (z_k' \cdot z_k')^{-1/2}$, and that $(h^{(k)})^{-1} = Q^{(k)} Q^{(k)T}$, or writing
$a^{(k)} = (h^{(k)})^{-1}$, that $a_{kk}^{(k)} = (q_{kk}^{(k)})^2 = (z_k' \cdot z_k')^{-1}$.

Now, given orthonormal vectors, $z_1^*, \ldots, z_{k-1}^*, z_k^*$, from the preceding
section the square of the projection of y onto $V_{k-1}$ was $\sum_{i=1}^{k-1} b_i^{*2}$, where

$$b^{*(k-1)} = z^{*(k-1)T} y;$$

while the square of the projection onto $V_k$ is $\sum_{i=1}^{k} b_i^{*2}$, where

$$b^{*(k)} = z^{*(k)T} y.$$

Thus, $b_k^{*2}$ is the increase in the square of the projection vector obtained
by activating the estimation vector $z_k$ (whose component orthogonal to
$V_{k-1}$ is $z_k'$); or, equivalently, $b_k^{*2}$ is the reduction in the square of the
residual error vector obtained by activating $z_k$.

Now the principle of the step-up procedure becomes clear. Given the
problem of augmenting by one vector an active estimation set of k-1,
the answer is to choose that one for which the new projection of y in
$V_k$ has the largest component orthogonal to the old projection in $V_{k-1}$; i.e.,
choose $z_k$ so that relative to the augmented Gram-Schmidt orthonormal
system, $z_1^*, \ldots, z_{k-1}^*, z_k^*$, $b_k^{*2}$ is maximum.

Again, it is important to be able to examine what values $b_k^{*2}$ could
have for the various possible vectors which could be chosen as $z_k$, and
to do this easily in terms of the original vectors. But recall that

$$z^{*(k)} = z^{(k)} Q^{(k)}, \qquad Q^{(k)} b^{*(k)} = b^{(k)} = (h^{(k)})^{-1} g^{(k)},$$

so that the triangularity of $Q^{(k)}$ implies that

$$q_{kk}^{(k)} b_k^* = b_k^{(k)}, \quad \text{or} \quad b_k^{*2} = \frac{(b_k^{(k)})^2}{a_{kk}^{(k)}}.$$
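
The relation between the Gram-Schmidt vectors and the successive gains $b_i^{*2}$ can be illustrated with a short sketch. This is a pure-Python example under stated assumptions: the names `dot`, `gram_schmidt` and `projection_gains` and the sample data are hypothetical. It uses the fact that $b_i^* = z_i^* \cdot y$ with $z_i^* = z_i' / (z_i' \cdot z_i')^{1/2}$, so that $b_i^{*2} = (z_i' \cdot y)^2 / (z_i' \cdot z_i')$.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(zs):
    """Return the unnormalised Gram-Schmidt vectors z'_1, ..., z'_k.

    Each z'_i is the component of z_i orthogonal to z_1, ..., z_{i-1}.
    """
    out = []
    for z in zs:
        w = list(z)
        for q in out:
            c = dot(w, q) / dot(q, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        out.append(w)
    return out


def projection_gains(zs, y):
    """b*_i^2 for each vector, in the order in which the vectors are given."""
    return [dot(w, y) ** 2 / dot(w, w) for w in gram_schmidt(zs)]


# Illustrative data (an assumption, not taken from the report).
z1 = [1.0, 1.0, 1.0, 1.0]
z2 = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 5.0]

gains = projection_gains([z1, z2], y)
# The gains add up to the squared length of the projection of y,
# so y.y minus their sum is the residual sum of squares.
residual = dot(y, y) - sum(gains)
```

Each entry of `gains` is the reduction in the squared residual obtained when the corresponding vector is activated after its predecessors.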


It is worth noting that the residual error vector can be considered
as a final Gram-Schmidt vector, since $e^{(k)} = y - \hat{y}^{(k)}$, where
$\hat{y}^{(k)}$ is the projection of y onto $V_k$. But we have seen that the reciprocal
of the square of the kth Gram-Schmidt vector is the last diagonal element
of the inverse of $h^{(k)}$. Thus, if the $h^{(k)}$ matrix being used is augmented
with an additional column $z^{(k)T} y$ and a symmetric row, corresponding to
the dependent-variable vector y, then the last diagonal element of the
inverse of this augmented matrix will be the reciprocal of the least
squares sum of squares of residual error.


The computations may be laid out as a sequence of gaussian elimination tableaux.

[Tableau not recoverable from this copy. The legible entries indicate that,
in the (k-1)st tableau, the first k-1 columns have been reduced to unit
vectors and the kth column carries the elements $h_{1k}^{(k-1)}, \ldots, h_{k-1,k}^{(k-1)}, h_{kk}^{(k-1)}$.]

Note that the first k-1 elements of the kth column, $h_{1k}^{(k-1)}, \ldots, h_{k-1,k}^{(k-1)}$,
are the solution $x^{(k-1)}$ of $h^{(k-1)} x^{(k-1)} = v^{(k-1)}$, from
which the kth Gram-Schmidt vector is obtainable. Note that the first k-1
elements of the g column are the solution $b^{(k-1)}$ of $h^{(k-1)} b^{(k-1)} = g^{(k-1)}$.
Note that if $z_k$ is to be the
next vector activated, then to obtain solutions to $h^{(k)} x^{(k)} = v^{(k)}$
and $h^{(k)} b^{(k)} = g^{(k)}$, and to obtain $a^{(k)} = (h^{(k)})^{-1}$, requires only to
operate on the above matrix with elementary (row) transformations so as
to reduce the kth column to the unit vector $u_k$. This will produce
$b_k^{(k)} = g_k^{(k-1)} / h_{kk}^{(k-1)}$ and $a_{kk}^{(k)} = 1 / h_{kk}^{(k-1)}$.
Thus $b_k^{*2} = (g_k^{(k-1)})^2 / h_{kk}^{(k-1)}$.


From the last equation it is easy to see that, to find the vector
yielding maximum $b_k^{*2}$, one need only examine the ratios $(g_j^{(k-1)})^2 / h_{jj}^{(k-1)}$
for j = k, k+1, ..., p.

Note finally that, after k vectors have been chosen, the last diagonal
element of the inverse of the augmented matrix would be $1/G^{(k)}$. Hence
$G^{(k)} = e^{(k)2}$, the sum of squares of residual error.
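
The step-up procedure described above can be sketched as follows. This is a minimal pure-Python illustration, not the report's program: it keeps a single (p+1) x (p+1) tableau of inner products of the estimation vectors and y (rather than the full tableau with an appended unit matrix), and the names `dot`, `step_up` and the threshold `tol` are assumptions. The value of `tol`, the minimum admissible squared orthogonal component, is arbitrary here.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def step_up(zs, y, k, tol=1e-10):
    """Activate up to k of the vectors in zs by the step-up rule.

    Returns (active, coeffs, rss): the indices in order of activation,
    the regression coefficient of each active vector, and the residual
    sum of squares.  tol is an assumed threshold below which a candidate
    is treated as dependent on the active set.
    """
    p = len(zs)
    cols = list(zs) + [y]
    # Tableau of inner products; the last row and column belong to y.
    t = [[dot(u, v) for v in cols] for u in cols]
    active = []
    for _ in range(k):
        best, best_gain = None, 0.0
        for j in range(p):
            if j in active or t[j][j] <= tol:
                continue
            # t[j][p] is g_j and t[j][j] is h_jj after the earlier sweeps.
            gain = t[j][p] ** 2 / t[j][j]
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:
            break
        # Row operations reducing column `best` to a unit vector.
        piv = t[best][best]
        t[best] = [x / piv for x in t[best]]
        for r in range(p + 1):
            if r != best:
                f = t[r][best]
                t[r] = [x - f * w for x, w in zip(t[r], t[best])]
        active.append(best)
    coeffs = {j: t[j][p] for j in active}
    return active, coeffs, t[p][p]


# Illustrative data (an assumption, not taken from the report).
zs = [[1.0, 1.0, 1.0, 1.0], [0.0, 1.0, 2.0, 3.0], [1.0, 0.0, 0.0, 1.0]]
y = [1.0, 3.0, 2.0, 5.0]
active, coeffs, rss = step_up(zs, y, 2)
```

After each sweep the lower right element of the tableau holds the current residual sum of squares, so no separate computation of the error vector is needed.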

Attention is called to the obvious fact that the step-up procedure of
activating estimation vectors in the order of the further reduction made
to sum of squares of error is not necessarily optimal in selecting say k
vectors out of p. E.g. the y vector could be practically in the space
of two vectors, $z_1$ and $z_2$, but lying closer to a third, $z_3$ (not in the space),
than to either of the given two. Thus the first vector selected would be
vector $z_3$. Then regardless of which one was selected next, the pair chosen
would be inferior to $z_1, z_2$.
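
A small numerical instance of this situation may make the point concrete. The sketch below is pure Python with hypothetical names (`dot`, `rss`, `greedy`) and made-up vectors chosen so that y lies almost in the plane of the first two vectors yet makes a smaller angle with the third. It compares the pair found by one-at-a-time selection with the best pair found by exhaustive search.

```python
from itertools import combinations


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rss(zs, y):
    """Residual sum of squares of y on span(zs), via Gram-Schmidt."""
    e = list(y)
    basis = []
    for z in zs:
        w = list(z)
        for q in basis:
            c = dot(w, q) / dot(q, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        basis.append(w)
        c = dot(e, w) / dot(w, w)
        e = [ei - c * wi for ei, wi in zip(e, w)]
    return dot(e, e)


def greedy(zs, y, k):
    """Pick k vectors one at a time, each giving the largest reduction."""
    chosen = []
    for _ in range(k):
        rest = [j for j in range(len(zs)) if j not in chosen]
        j = min(rest, key=lambda j: rss([zs[i] for i in chosen + [j]], y))
        chosen.append(j)
    return chosen


# Hypothetical vectors: y is nearly in the plane of z1 and z2,
# but makes a smaller angle with z3 than with either of them.
zs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.5]]
y = [1.0, 1.0, 0.05]

chosen = greedy(zs, y, 2)
greedy_rss = rss([zs[i] for i in chosen], y)
best_pair = min(combinations(range(3), 2),
                key=lambda pr: rss([zs[i] for i in pr], y))
best_rss = rss([zs[i] for i in best_pair], y)
```

With these vectors the one-at-a-time selection starts with the third vector and ends with a residual sum of squares far larger than that of the pair formed by the first two.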

One  other  word  of  caution  is  in  order.  The  criterion  for  activating 
the  next  estimation  vector  is  a maximum  ratio.  The  denominator  of  this 
ratio is the square of the length of the component of the new vector in
the direction orthogonal to the then current estimation space. Of course,
if  some  of  the  remaining  vectors  lie  in  the  currently  active  estimation 
space  (i.e.,  they  are  linearly  dependent  on  vectors  already  chosen) 
they  should  not  be  considered  as  candidates.  Because  of  roundoff  errors 
such  dependency  must  be  defined  approximately.  Note  that  an  almost 
dependent  vector  will  produce  a small  orthogonal  component  which  will  tend 
to  produce  a large  criterion  ratio  (which  may  be  primarily  an  accident 
of  roundoff  error) . To  avoid  spurious  selections  caused  in  this  way  the 
criterion  should  be  compared  only  for  those  vectors  whose  orthogonal 
component  exceeds  a minimum  value.  What  minimum  value  ought  to  be 
chosen  is  at  this  time  a matter  for  conjecture. 

4.  Criterion  for  eliminating  insignificant  variables.  From  the 
discussion  in  the  preceding  section  it  evidently  may  happen  that,  in  trying 
to activate an efficient set of k estimation vectors, the step-up procedure
will select at one stage a vector which later on would be more
efficiently  eliminated.  So  far  no  procedure  for  deactivating  any  of  the 
active  estimation  vectors  has  been  incorporated.  However,  the  algebraic 
technique  for  eliminating  any  designated  active  estimation  vector  and 
obtaining  the  regression  analysis  for  the  reduced  set  is  well-known. 

It  is  a question  of  deciding  whether  to  eliminate  one  and  if  so  which 
one to eliminate. The purpose of this section is to provide a geometrically
appealing and obvious answer to the second aspect of this question.
Criteria  for  deciding  whether  to  eliminate  a variable  will  be  discussed 
in  the  next  section. 

Therefore we suppose k estimation vectors have been activated and
the corresponding analysis laid out, say in the manner of the sequence
of gaussian tableaux referred to in the last section, and we suppose the
decision has been made to eliminate one of the vectors. The question is:
Which one shall we eliminate? Fix attention on one of the active $z_i$,
say for definiteness the last one, $z_k$. Now the projection $\hat{y}^{(k)}$ of y
onto $V_k$ can be resolved into its projection $\hat{y}^{(k-1)}$ onto the space $V_{k-1}$
spanned by $z_1, \ldots, z_{k-1}$ and a component orthogonal to $\hat{y}^{(k-1)}$. The
projection $\hat{y}^{(k-1)}$ of $\hat{y}^{(k)}$ onto $V_{k-1}$ is indeed the same as the direct
projection of y onto $V_{k-1}$, so that the orthogonal component mentioned
above in the resolution of $\hat{y}^{(k)}$ is the net effect of the active vector
$z_k$ in the estimation of y with $\hat{y}^{(k)}$. Still keeping attention on $z_k$,
we have already seen that the square of the length of this orthogonal
component is $b_k^{*2}$, where in fact $b_k^*$ is a component in the direction
of the kth Gram-Schmidt vector generated according to the order in which
the $z_i$ were selected. Also,

$$b_k^{*2} = \frac{(b_k^{(k)})^2}{a_{kk}^{(k)}},$$

where, it will be recalled,

$$h^{(j)} b^{(j)} = g^{(j)}$$

for any j = 1, 2, ..., p; with $h^{(j)} = z^{(j)T} z^{(j)}$, $z^{(j)} = (z_1, \ldots, z_j)$,
$a^{(j)} = (h^{(j)})^{-1}$.

Recall also the pythagorean relation for each j = 1, 2, ..., p,

$$(y \cdot y) = y^2 = \hat{y}^{(j)2} + e^{(j)2},$$

where

$$\hat{y}^{(j)2} = \sum_{i=1}^{j} b_i^{*2} \quad \text{and} \quad e^{(j)2} = \sum_{\mu=j+1}^{N} y_\mu'^2,$$

with $y' = Py$, the image of y under orthogonal transformation. Thus,
remembering that $b_k^* = y_k'$,

$$y^2 = \hat{y}^{(k-1)2} + b_k^{*2} + e^{(k)2}.$$

Evidently $b_k^{*2}$ can be interpreted as the net reduction in the square
of the error vector obtained by activating $z_k$, or, equally as well,
as the net increment (provided by activating $z_k$) in the square of the
active estimate.

Imagine now that the gaussian elimination has proceeded to the point
of obtaining a solution to $h^{(k)} b^{(k)} = g^{(k)}$.
But now suppose j < k and the order in which $z_j$ and $z_k$ have been introduced
is reversed. Imagine re-scheduling the calculations in the gaussian
elimination for this revision. In the tableaux this would be accomplished
if in the initial tableau the jth and kth rows were interchanged and the
jth and kth columns (to restore the initial unit matrix on the right
the (p+1+j)-th and the (p+1+k)-th columns would also have to be
interchanged), and thereafter repeating the operations which produced the
kth tableau laid out above. The solution vector $b^{(k)}$ in this case would
be the same as before except that the order of $b_j^{(k)}$ and $b_k^{(k)}$ would be
interchanged. Moreover, the inverse matrix would be the same except
that the jth and kth rows and the jth and kth columns would be switched,
putting $a_{jj}^{(k)}$ in the (k,k)-position and $a_{kk}^{(k)}$ in the (j,j)-position.
Note now that $(b_j^{(k)})^2 / a_{jj}^{(k)}$ plays the role of $b_k^{*2}$, and hence the quantity
$(b_j^{(k)})^2 / a_{jj}^{(k)}$ is the net reduction in the square of the error vector due
to the $z_j$ vector.

Now it is clear which of the k active estimation vectors should be
eliminated, viz. that $z_j$ ($j \le k$) for which $(b_j^{(k)})^2 / a_{jj}^{(k)}$ is least.
Observe that these ratios are computable from the kth gaussian tableau
set out above without any re-computations.
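
The elimination criterion can be illustrated with a short sketch. It is a pure-Python example under stated assumptions: the names `dot`, `regress` and `weakest` and the data are hypothetical. The tableau [h | g | I] is reduced by elementary row operations to [I | b | a], after which the ratios $b_j^2 / a_{jj}$ are read off directly; no pivoting is used, which is reasonable here only because h is positive definite for linearly independent vectors.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def regress(zs, y):
    """Reduce the tableau [h | g | I] to [I | b | a] by row operations.

    Returns (b, a_diag, rss): the coefficients, the diagonal of the
    inverse a = h^{-1}, and the residual sum of squares y.y - b.g.
    """
    k = len(zs)
    t = [[dot(zi, zj) for zj in zs] + [dot(zi, y)]
         + [1.0 if i == j else 0.0 for j in range(k)]
         for i, zi in enumerate(zs)]
    for c in range(k):
        piv = t[c][c]
        t[c] = [x / piv for x in t[c]]
        for r in range(k):
            if r != c:
                f = t[r][c]
                t[r] = [x - f * w for x, w in zip(t[r], t[c])]
    b = [row[k] for row in t]
    a_diag = [t[i][k + 1 + i] for i in range(k)]
    rss = dot(y, y) - sum(bi * dot(zi, y) for bi, zi in zip(b, zs))
    return b, a_diag, rss


def weakest(zs, y):
    """Index of the active vector whose removal costs least, and all costs."""
    b, a_diag, _ = regress(zs, y)
    losses = [bi * bi / ai for bi, ai in zip(b, a_diag)]
    j = min(range(len(zs)), key=losses.__getitem__)
    return j, losses


# Illustrative data (an assumption, not taken from the report).
zs = [[1.0, 1.0, 1.0, 1.0], [0.0, 1.0, 2.0, 3.0], [1.0, 0.0, 0.0, 1.0]]
y = [1.0, 3.0, 2.0, 5.0]
j_min, losses = weakest(zs, y)
```

Each entry of `losses` equals the increase in the residual sum of squares that would result from deleting the corresponding vector and refitting with the rest.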

Having decided which estimation vector is to be eliminated from
the active set of k, the procedure for making the elimination and obtaining
the regression analysis for the reduced set of k-1 active estimation
vectors is as follows. According to the foregoing remarks no generality
will be lost if we assume that the vector to be eliminated is $z_k$. But
recall that to add $z_k$ to the active set, $z_1, \ldots, z_{k-1}$, and to obtain the
regression analysis for the augmented set it was only necessary to perform
on the (k-1)-st tableau those elementary row transformations which reduce
the kth column to the unit vector $u_k$. Therefore, to eliminate $z_k$ it
is only necessary to undo these calculations. It is not hard to verify
that the reversing calculations are those elementary row transformations
(on the kth tableau) which reduce the kth column of the inverse $a^{(k)}$ back
to $u_k$.


It  is  of  course  only  a notational  convenience  to  assume  that  the 
estimation  vectors  activated  are  the  first  k of  the  p listed  in  the 
tableaux.  The  swapping  of  rows  and  columns,  while  tidying  up  the 
written  portrayal  of  the  tableaux,  etc.,  is  completely  unnecessary  for 
computer  handling  of  the  problem. 

Finally we shall mention that the rule described above for deciding
which vector to eliminate is equivalent to that of eliminating the active
vector that has the smallest partial correlation with the dependent
variable vector. The partial correlation coefficient between $z_k$
(say) and y is the cosine of the angle between $e^{(k-1)}$ and $\hat{y}^{(k)} - \hat{y}^{(k-1)}$.
From the sketch below it is clear that this correlation decreases as
the length $\| \hat{y}^{(k)} - \hat{y}^{(k-1)} \| = |b_k^*|$ decreases:

[Sketch not reproduced in this copy.]

From the definition of cosine between $e^{(k-1)}$ and $\hat{y}^{(k)} - \hat{y}^{(k-1)}$ it is easy
to show that

$$\cos^2 \theta\big(e^{(k-1)}, \hat{y}^{(k)} - \hat{y}^{(k-1)}\big) = \frac{b_k^{*2}}{e^{(k-1)} \cdot e^{(k-1)}} = \frac{(b_k^{(k)})^2}{e^{(k-1)2} \, a_{kk}^{(k)}}.$$

5.  Decision rules: the statistical model. In the last section
the question answered was which active estimation variable ought to be
eliminated once the decision had been made to eliminate one. The
question of constructing decision rules to tell when to eliminate a
variable was left for this section. Defining a sweep or iteration as
a step in which either an inactive estimation vector is activated or
an active one is deactivated, an obvious type of decision rule is the
following: Activate two vectors according to the step-up procedure, then
eliminate one by the method described in the preceding section, and
continue operating under this rule until some stopping rule (see below)
stops the entire procedure. It is conceivable that such a rule would
have utility if it is important in the ultimate application to have no
more than k vectors while the cost of the extra sweeps is relatively
unimportant.

Of course if, of k active estimation vectors, one has a partial
correlation with the dependent variable vector of practically zero,
it would seem wise to eliminate it. This suggests another quite arbitrary
type of elimination rule: Of the k currently active estimation
vectors eliminate the one of lowest partial correlation with y if said
partial correlation is less than some level $\alpha(k)$, possibly a function
of k.

Another  decision  problem  must  be  dealt  with,  viz.  that  of  constructing 
a stopping  rule  to  stop  the  step-up  procedure  (with  or  without  modification 
to allow for deletions). Here again, certain obvious but rather arbitrary
rules come to mind. E.g., stop when k vectors have been activated
(actually  this  was  the  somewhat  naive  rule  used  to  motivate  the  section 
on  the  step-up  procedure).  It  seems  clear  that,  by  itself,  this  is  not 
a good  rule,  since  in  a particular  example  a satisfactory  estimate  may 
be  attainable  with  far  fewer  than  k vectors  (i.e.  the  multiple  correlation 
coefficient  may  be  already  very  near  one  with  fewer  vectors  or  simply 
may  not  be  improved  "significantly"  to  warrant  the  inclusion  of  more). 

We  take  the  position  at  the  present  time  of  recommending  a fairly 
comprehensive  battery  of  stopping  rules,  any  combination  of  which  might 
be  used,  with  a variety  of  sensitivity  settings  possible.  Intuition 
suggests  that  appropriate  settings  will  vary  with  the  type  of  problem, 
the usage requirements and the burden of cost in time and money. Perhaps
a battery of stopping rules should at least make provision for
stopping when a fixed number of estimation vectors have been activated,
when the estimate is of sufficiently high accuracy (multiple correlation
sufficiently near one), when the number of sweeps exceeds a certain
number (this acts as a safeguard against a cyclic pattern of activation
and elimination of vectors), and when the last r (say) vectors activated
have not produced a "significant" change in the estimate.

Again the word, "significant", requires specific interpretation
before the rule can be operational. One modus operandi might be: Stop
the procedure if the increase in the multiple correlation coefficient
R, produced by adding the last r active estimation vectors, was less
than $\beta(r,k)$.

Both  in  the  question  of  whether  to  deactivate  an  active  estimation 
vector  and  in  the  question  of  when  to  stop  activating  estimation  vectors 
the  notion  of  significant  effect  arises.  This  suggests  the  possibility 
of resorting to a statistical model where the techniques of testing hypotheses
might be invoked as a basis for decisions on whether to eliminate
a variable  or  whether  to  stop  the  activation  process. 

In  the  remainder  of  this  section  we  shall  sketch  the  outline  of  a 
statistical  model  perhaps  sufficiently  to  indicate  the  attractiveness  of 
such  a decision  mechanism  as  well  as  to  indicate  some  of  the  limitations 
of  such  a model. 

Very  briefly  the  model  develops  a statistic,  or  function  of  the 
observed  active  estimation  vectors  and  the  dependent  variable  vector, 
called an F statistic, which is the decision-making instrument: large
F means significance of the effects being tested and small F means
nonsignificance.  Under  the  hypothesis  of  the  statistical  model,  and 
under  the  additional  hypothesis  that  the  effects  of  the  estimation 
vectors  being  tested  are  only  "noise"  effects  or  effects  introduced  by 
virtue  of  random  fluctuations,  the  F statistic  is  expected  to  have  a 
value  of  about  unity. 

Actually,  the  F statistic  is  a ratio  of  the  average  of  the  effects 
of  the  vectors  being  tested  to  the  average  of  some  random  error  effects. 

In the terminology developed in previous sections suppose that
$z_{k-r+1}, \ldots, z_k$ are active estimation vectors whose combined effect is
being tested. Recall that $\hat{y}^{(k)}$ is the projection of y on the space
spanned by $z_1, \ldots, z_k$ and that $\hat{y}^{(k-r)}$ is the projection of $\hat{y}^{(k)}$ as
well as the projection of y onto the subspace spanned by $z_1, \ldots, z_{k-r}$.

In the F ratio the average of the effects of the r vectors $z_{k-r+1}, \ldots, z_k$
is measured as $\frac{1}{r}$ times the square of the length of the vector $\hat{y}^{(k)} - \hat{y}^{(k-r)}$,
while the average of error components is measured as $\frac{1}{N-k}$ times the
square of the so-called error vector, $e^{(k)}$ (recall that $e^{(k)}$ lies in a
space of N-k dimensions orthogonal to the space generated by $z_1, \ldots, z_k$
in which $\hat{y}^{(k)} - \hat{y}^{(k-r)}$ lies). Obviously, values of the F statistic less
than one would not tend to support significant effects of $z_{k-r+1}, \ldots, z_k$,
while values greater than one presumably would. With the normal law of
errors assumed in the statistical model and under the hypothesis that
these supposed effects of the last r vectors are noise effects, it turns
out that the chances are approximately even that F should exceed the
critical value of unity. If the critical value is increased the probability
that the F statistic will exceed it diminishes rapidly. These
probabilities are tabulated for various critical values and various
degrees of freedom (r and N-k in our case). One may establish a decision
rule to reject the hypothesis of no systematic effect (from the estimation
vectors being tested) if the value of the F statistic observed is improbably
larger than one.
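
The F ratio just described can be computed from two residual sums of squares, since the squared length of $\hat{y}^{(k)} - \hat{y}^{(k-r)}$ equals $e^{(k-r)2} - e^{(k)2}$ by the Pythagorean relation. The following pure-Python sketch assumes hypothetical names (`dot`, `rss`, `f_statistic`) and illustrative data; it computes only the ratio and leaves the comparison with a tabulated critical value to the reader.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rss(zs, y):
    """Residual sum of squares of y on span(zs), via Gram-Schmidt."""
    e = list(y)
    basis = []
    for z in zs:
        w = list(z)
        for q in basis:
            c = dot(w, q) / dot(q, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        basis.append(w)
        c = dot(e, w) / dot(w, w)
        e = [ei - c * wi for ei, wi in zip(e, w)]
    return dot(e, e)


def f_statistic(zs, y, r):
    """F ratio for the combined effect of the last r vectors in zs.

    Numerator:   (1/r) * |y_hat(k) - y_hat(k-r)|^2
    Denominator: (1/(N-k)) * |e(k)|^2
    """
    n, k = len(y), len(zs)
    full = rss(zs, y)               # e(k)^2
    reduced = rss(zs[:k - r], y)    # e(k-r)^2
    effect = (reduced - full) / r
    error = full / (n - k)
    return effect / error


# Illustrative data (an assumption, not taken from the report).
z1 = [1.0, 1.0, 1.0, 1.0]
z2 = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 5.0]

F = f_statistic([z1, z2], y, 1)   # tests the effect of z2 given z1
```

The degrees of freedom to be used with a table of the F distribution are r and N-k, here 1 and 2.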

The decision rule is not complete until specific numbers or functions
are attached to the words "improbably larger." Undoubtedly a judicious
choice depends on several factors involved in the balancing of cost and
return in a particular problem. This is one of the open questions we
have tried to study experimentally in another supporting study.

To  complete  the  exposition  some  description  of  the  characteristics 
of the assumed statistical model is warranted, although as we have mentioned
there are recent excellent accounts of this model.

In the statistical linear regression model it is assumed that,
except for random variations, Y is a linear function of the $Z_i$. Thus

$$y_\mu = \sum_{i=1}^{p} \beta_i z_{i\mu} + e_\mu, \qquad \mu = 1, 2, \ldots, N,$$

where the $e_\mu$ are random errors. In addition it is usually assumed that
the $e_\mu$ are uncorrelated with a common variance $\sigma^2$ and a mean of
zero. The $\beta_i$ are parameters which may be estimated in an optimal way
under the circumstances. In fact, the best linear unbiased estimate of a
linear combination of the $\beta_i$, say $\eta = \sum \beta_i Z_i$, best in
the sense of smallest variance, is $\hat{Y} = \sum b_i Z_i$, where the $b_i$
are precisely those which produce the least squares estimate. This is the
Gauss-Markov theorem.

It implies that, if the true functional relationship is, except for a
random error, $Y = \eta = \sum \beta_i Z_i$, then, faced with not knowing the
exact values of the $\beta_i$, the next best thing is to use the estimation
function $\hat{Y} = \sum b_i Z_i$.

To  see  the  truth  of  this  theorem  we  shall  need  to  use  the  expected 
value  or  mean  value  operator  E operating  on  a random  variable  or 
vector  or  matrix,  with  the  expected  value  of  a matrix  of  random  variables 
being  the  matrix  of  expected  values.  From  this  definition  it  follows 
directly  that  E A X B = A(EX)B,  if  X is  a random  matrix  and  A and  B are 
nonrandom  matrices. 

Now under the statistical model above, $y = z\beta + e$, where
$y^T = (y_1, \ldots, y_N)$, $z = (z_1, \ldots, z_p)$,
$z_i^T = (z_{i1}, \ldots, z_{iN})$, $\beta^T = (\beta_1, \ldots, \beta_p)$,
$e^T = (e_1, \ldots, e_N)$, with e (and hence y) being random vectors.
According to the assumptions, $Ee = 0$ so that $Ey = z\beta$; and the
$e_\mu$ are uncorrelated with a common variance $\sigma^2$, so that
$E\,ee^T = \sigma^2 I$, I being an identity matrix. Note that the $z_i$
vectors are nonrandom.

First we show that $Eb = \beta$, i.e., that the $b_i$ are unbiased estimates
of the corresponding $\beta_i$. In fact

$$Eb = E\,h^{-1}g = h^{-1}E\,z^Ty = h^{-1}z^TEy = h^{-1}z^Tz\beta = h^{-1}h\beta = I\beta = \beta.$$


Next we exhibit the covariance matrix of the estimates b:

$$E(b - Eb)(b - Eb)^T = E(b - \beta)(b - \beta)^T
  = E(h^{-1}g - Eh^{-1}g)(h^{-1}g - Eh^{-1}g)^T
  = h^{-1}E(g - Eg)(g - Eg)^T h^{-1},$$

since h and $h^{-1}$ are symmetric. Now

$$E(g - Eg)(g - Eg)^T = E(z^Ty - Ez^Ty)(z^Ty - Ez^Ty)^T
  = z^T E(y - Ey)(y - Ey)^T z = z^T E\,ee^T z = z^T\sigma^2 I z = \sigma^2 h.$$

Hence, substituting above,

$$E(b - Eb)(b - Eb)^T = h^{-1}\sigma^2 h h^{-1} = h^{-1}\sigma^2.$$

Now consider $\hat{Y} = \sum b_i Z_i = Z^Tb$ as an estimate of
$\eta = \sum \beta_i Z_i = Z^T\beta$. Observe that

$$\hat{Y} = Z^Tb = Z^Th^{-1}z^Ty = a^Ty,$$

where $a^T = Z^Th^{-1}z^T$. This is what is meant by saying that $\hat{Y}$
is a linear estimate of $\eta$; i.e. it is a linear combination of the
observed values of the random dependent variable Y.

Also $E\hat{Y} = EZ^Tb = Z^TEb = Z^T\beta = \eta$. Hence $\hat{Y}$ is an
unbiased estimate of $\eta$.

Finally we must show that the variance of $\hat{Y}$ is less than that of any
other linear unbiased estimate of $\eta$. Suppose $\tilde{Y}$ to be another
linear unbiased estimate of $\eta$, so that
$\tilde{Y} = c_1y_1 + \ldots + c_Ny_N = c^Ty$, and $Ec^Ty = \eta$.
Now consider vectors in Euclidean N-space. Note that $a = z(h^{-1}Z)$, a
vector lying in the estimation space spanned by the vectors
$z_1, \ldots, z_p$. We shall see that the vector a is the projection of c
onto the space spanned by $z_1, \ldots, z_p$. Since $Ea^Ty = Ec^Ty$, then

$$0 = E(c-a)^Ty = (c-a)^TEy = (c-a)^Tz\beta.$$

This identity can hold for all $\beta$ only if $(c-a)^Tz = 0$. But this
implies that

$$(c-a)^Ta = (c-a)^Tz(h^{-1}Z) = 0.$$

Hence a and c-a are orthogonal, and the Pythagorean relation,
$c^Tc = a^Ta + (c-a)^T(c-a)$, holds.

The variance of $\tilde{Y}$ is

$$E(\tilde{Y} - E\tilde{Y})^2 = E(\tilde{Y} - E\tilde{Y})(\tilde{Y} - E\tilde{Y})^T
  = E(c^Ty - Ec^Ty)(c^Ty - Ec^Ty)^T$$
$$= c^TE(y - Ey)(y - Ey)^Tc = c^TE\,ee^Tc
  = \sigma^2c^Tc = \sigma^2\{a^Ta + (c-a)^T(c-a)\} \ge \sigma^2a^Ta.$$

But of course by the same reasoning the variance of $\hat{Y}$ is
$\sigma^2a^Ta$. This shows that $\hat{Y}$ is of minimum variance.
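The orthogonality argument can be checked numerically. The sketch below is
illustrative only: the six-point design, the point Z at which eta is
estimated, and the vector w are arbitrary choices, and the helper names are
hypothetical. It builds $a = z(h^{-1}Z)$, forms a competing unbiased
coefficient vector c = a + d with d orthogonal to the estimation space, and
compares $c^Tc$ with $a^Ta$ (the variances apart from the common factor
$\sigma^2$).

```python
def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def solve2(h, g):
    """Solve the 2x2 system h x = g by Cramer's rule."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return [(g[0] * h[1][1] - h[0][1] * g[1]) / det,
            (h[0][0] * g[1] - g[0] * h[1][0]) / det]

# Two estimation vectors (a constant and a linear term) at N = 6 points.
z1 = [1.0] * 6
z2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
h = [[dot(z1, z1), dot(z1, z2)], [dot(z2, z1), dot(z2, z2)]]   # h = z^T z

Z = [1.0, 2.5]                      # eta = beta_1 + 2.5 * beta_2
t = solve2(h, Z)                    # h^{-1} Z
a = [t[0] * u + t[1] * v for u, v in zip(z1, z2)]              # a = z (h^{-1} Z)

# d = part of an arbitrary vector w orthogonal to the estimation space.
w = [1.0, -2.0, 0.5, 3.0, 0.0, -1.0]
s = solve2(h, [dot(z1, w), dot(z2, w)])
d = [wi - (s[0] * u + s[1] * v) for wi, u, v in zip(w, z1, z2)]
c = [ai + di for ai, di in zip(a, d)]   # another linear unbiased estimate c^T y

if __name__ == "__main__":
    print("c^T z =", dot(c, z1), dot(c, z2))      # equals Z, so c^T y is unbiased
    print("a^T (c - a) =", dot(a, d))             # zero up to rounding
    print("a^T a =", dot(a, a), " c^T c =", dot(c, c))
```

Any other choice of w outside the estimation space gives a different c, but
the excess of $c^Tc$ over $a^Ta$ is always the squared length of d.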

To arrive at the F-statistic test for our decision rule in eliminating
an estimation vector, or in stopping the activation of estimation vectors,
additional assumptions are needed. Suppose that k of the estimation
vectors, $z_1, \ldots, z_k$, have been activated, and it happens that, except
for a random error, $Y = \sum_{i=1}^{k} \beta_i Z_i$; in short that the
statistical model is valid with these k variables, so that

$$y_\mu = \sum_{i=1}^{k} \beta_i z_{i\mu} + e_\mu \quad\text{or}\quad y = z^{(k)}\beta^{(k)} + e,$$

where $e^T = (e_1, \ldots, e_N)$. Suppose, in addition to the conditions
that $Ee = 0$ and $E\,ee^T = \sigma^2 I$, we require that the $e_\mu$ be
normally distributed. Now suppose we wish to test the hypothesis ($H_0$)
that the last r parameters $\beta_{k-r+1}, \ldots, \beta_k$ are in fact all
zero. (Accepting this hypothesis implies that the activation of the last r
estimation variables adds nothing to the estimate available with the first
k-r variables.)

The basic idea of such a test is to divide the sample space, i.e.
the space of possible values of the vector y, into a rejection region R
and its complement, an acceptance region, the ultimate decision rule
being to reject $H_0$ in case the observed value of y falls in R. Naturally,
in order to make the test a discriminating or powerful one the points
in the rejection region ought to be chosen roughly so as to maximize the
probability of rejection when $H_0$ is not true, while at the same time the
probability of rejection when $H_0$ is true should be kept below a certain
bound. Such a test is approximately obtained by putting in R those points
with highest "trade-off ratio," this ratio being essentially the ratio
of the maximum of the probability density functions (pdf) over the entire
family of pdf's defined by the admissible values of the parameters, to
the maximum of the pdf's over the subfamily where the hypothesis $H_0$ holds.
This ratio is called the likelihood ratio $\lambda$. Such points of highest
likelihood ratio are placed in R until the set is as large as it can be
and still have the desired bound on the probability of rejection when $H_0$
is true.

The optimal character of the likelihood ratio test for the problem
at hand is given excellent treatment in Scheffé.

Let $\Omega$ stand for the parameter space of admissible values of the
parameters. In our case

$$\Omega = \{\beta^{(k)}, \sigma^2 \mid -\infty < \beta^{(k)} < \infty,\ \sigma^2 > 0\}.$$

Let $\omega$ stand for the subset of $\Omega$ where $H_0$ is true; i.e.

$$\omega = \{\beta^{(k)}, \sigma^2 \mid -\infty < \beta^{(k-r)} < \infty,\ \beta_{k-r+1} = \ldots = \beta_k = 0,\ \sigma^2 > 0\}.$$

According to the hypothesis of the model the $e_\mu$ are normally
distributed, uncorrelated (and hence independent) with common variance
$\sigma^2$. Thus the joint pdf of the random vector y is (for a parameter
point in $\Omega$)

$$f(y; \beta^{(k)}, \sigma^2) = \prod_{\mu=1}^{N} (2\pi\sigma^2)^{-1/2}
  \exp\Big\{-\frac{1}{2\sigma^2}\Big(y_\mu - \sum_{i=1}^{k}\beta_i z_{i\mu}\Big)^2\Big\}$$
$$= (2\pi\sigma^2)^{-N/2}
  \exp\Big\{-\frac{1}{2\sigma^2}(y - z^{(k)}\beta^{(k)})^T(y - z^{(k)}\beta^{(k)})\Big\}.$$

Now to determine R it is necessary to maximize f over $\Omega$ and over
$\omega$, form the ratio $\lambda$, and select values of y for which this is
highest:

$$R = \Big\{y \ \Big|\ \lambda(y) = \frac{\sup_\Omega f}{\sup_\omega f} \ge \lambda_\alpha\Big\},$$

where $\lambda_\alpha$ is a critical value chosen so that

$$\Pr\{y \in R \mid H_0 \text{ is true}\} \le \alpha;$$

here $\alpha$ is called the significance or rejection level of the test.

We recall now that a sum of squares of m normal independent random
variables with mean zero and variance one (N(0,1)) is said to be a
Chi-square variable with m degrees of freedom. The ratio of the averages of
two such sums of squares of independent N(0,1) variables, with $m_1$ terms
in the numerator and $m_2$ in the denominator, is called an F variable
with $m_1$ and $m_2$ degrees of freedom. The probability distribution of the
F variable is widely tabulated. The following result is the one pertinent
to our problem. For a statistical linear regression model, where the
errors are $N(0,\sigma^2)$ independently distributed, the rejection region R
of significance level $\alpha$, provided by the likelihood ratio criterion
for rejecting $H_0$ as described above, is given by

$$R = \Big\{y \ \Big|\ \frac{(\hat{y}^{(k)} - \hat{y}^{(k-r)})^2/r}{e^{(k)2}/(N-k)} \ge F^{(\alpha)}_{r,N-k}\Big\},$$

where $F^{(\alpha)}_{r,N-k}$ is the critical value in the $F_{r,N-k}$
distribution for which

$$\Pr\{F_{r,N-k} \ge F^{(\alpha)}_{r,N-k}\} = \alpha.$$
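The statistic in this rejection region can be computed directly from two
least squares fits. The following is a minimal sketch under stated
assumptions: the data are synthetic (a seeded generator, N = 40, one real
effect and one unrelated variable), the helper names are hypothetical, and
the critical value $F^{(\alpha)}_{r,N-k}$ is not computed here because it
would be taken from tables.

```python
import random

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(n):
            if i != c:
                f = m[i][c] / m[c][c]
                for j in range(c, n + 1):
                    m[i][j] -= f * m[c][j]
    return [m[i][n] / m[i][i] for i in range(n)]

def fitted(cols, y):
    """Least squares fitted values from the normal equations h b = g."""
    h = [[dot(u, v) for v in cols] for u in cols]
    g = [dot(u, y) for u in cols]
    b = solve(h, g)
    return [sum(bj * col[i] for bj, col in zip(b, cols)) for i in range(len(y))]

def f_statistic(cols, y, r):
    """F statistic for the hypothesis that the last r coefficients are zero."""
    n, k = len(y), len(cols)
    y_full = fitted(cols, y)                 # yhat(k)
    y_red = fitted(cols[:k - r], y)          # yhat(k-r)
    extra = sum((p - q) ** 2 for p, q in zip(y_full, y_red))
    err = sum((p - q) ** 2 for p, q in zip(y, y_full))     # squared length of e(k)
    return (extra / r) / (err / (n - k))

rng = random.Random(7)
N = 40
ones = [1.0] * N
x1 = [i / N for i in range(N)]
x2 = [((7 * i) % 11) / 11.0 for i in range(N)]     # unrelated to y by construction
y = [1.0 + 2.0 * v + rng.gauss(0.0, 0.1) for v in x1]

f_x2 = f_statistic([ones, x1, x2], y, r=1)   # tests the coefficient of x2
f_x1 = f_statistic([ones, x2, x1], y, r=1)   # tests the coefficient of x1

if __name__ == "__main__":
    print("F for the unrelated variable:", round(f_x2, 3))
    print("F for the real effect:       ", round(f_x1, 1))
```

The vector being tested is placed last in the column list, which mirrors the
convention in the text that the hypothesis concerns the last r parameters.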

The proof of this important theorem is obtained by constructing the
likelihood ratio $\lambda$, in which the maximization problems are observed
to be essentially the least squares problem, then reducing the inequality
$\lambda(y) \ge \lambda_\alpha$ which defines the rejection set to the form
given in the conclusion. Used in the proof are: the orthogonal transform of
y based on the Gram-Schmidt vectors
$z^*_1, \ldots, z^*_{k-r}, z^*_{k-r+1}, \ldots, z^*_k$ and the fact that
orthogonal transforms of normal vectors are normal. Although the proof
is available in numerous references, we sketch it here.

Lemma 1. Let y be a vector of $N(m_\mu, \sigma^2)$, independent, random
variables, and let $y' = Py$ be an orthogonal transform of y. Then y' is a
vector of $N(m'_\mu, \sigma^2)$, independent, random variables, with
$m'_\mu = \sum_{\nu=1}^{N} p_{\mu\nu}m_\nu$, where $P = (p_{\mu\nu})$.

Proof: Write $m^T = (m_1, \ldots, m_N)$, and let $G(\xi')$ be the
distribution function of y'. Then

$$G(\xi') = \Pr[y' \le \xi'] = \Pr[Py \le \xi'] = \Pr[\{y \mid Py \le \xi'\}]
  = \int_{\{y \mid Py \le \xi'\}} (2\pi\sigma^2)^{-N/2}
    \exp\Big\{-\frac{1}{2\sigma^2}(y - m)^T(y - m)\Big\}\,dy.$$

Now, making the transformation $y' = Py$ in the integral, the Jacobian of
the transformation is the determinant of the orthogonal matrix P, hence in
absolute value is one; the domain of integration is transformed into
$\{y' \mid y' \le \xi'\}$; and the integrand becomes
$(2\pi\sigma^2)^{-N/2}\exp\{-\frac{1}{2\sigma^2}(y' - Pm)^T(y' - Pm)\}$.
Hence

$$G(\xi') = \prod_{\mu=1}^{N} \int_{-\infty}^{\xi'_\mu} (2\pi\sigma^2)^{-1/2}
  \exp\Big\{-\frac{1}{2\sigma^2}(y'_\mu - m'_\mu)^2\Big\}\,dy'_\mu,$$

so that obviously the $y'_\mu$ are $N(m'_\mu, \sigma^2)$, independent.

It is a corollary of Lemma 1 that, if e is a vector of $N(0,\sigma^2)$,
independent variables and $e' = Pe$, P orthogonal, then e' is a vector of
$N(0,\sigma^2)$ independent variables.

Lemma 2. Let $y = z^{(k)}\beta^{(k)} + e$ be a statistical linear regression
model. Let $z^{*(k)}$ be the matrix of orthonormal vectors generated from
$z_1, \ldots, z_k$ by the Gram-Schmidt process, so that
$z^{*(k)} = z^{(k)}Q^{(k)}$ where $Q^{(k)}$ is upper triangular. Let
$\beta^{*(k)} = Q^{(k)-1}\beta^{(k)}$. Then
$\beta_{k-r+1} = \ldots = \beta_k = 0$ if and only if
$\beta^*_{k-r+1} = \ldots = \beta^*_k = 0$.

Proof: Suppose $\beta_{k-r+1} = \ldots = \beta_k = 0$; it follows from the
equation $\beta^{*(k)} = Q^{(k)-1}\beta^{(k)}$ and the fact that $Q^{(k)-1}$
is upper triangular that $\beta^*_k = 0$, then $\beta^*_{k-1} = 0$, etc.,
until $\beta^*_{k-r+1} = 0$. The converse argument is the same.

Proof of the main theorem: By Lemma 2

$$\omega = \{\beta^{*(k)}, \sigma^2 \mid -\infty < \beta^{*(k-r)} < \infty,\ \beta^*_{k-r+1} = \ldots = \beta^*_k = 0,\ \sigma^2 > 0\},$$

and of course, since $y = z^{(k)}\beta^{(k)} + e$ and
$z^{(k)}\beta^{(k)} = z^{(k)}Q^{(k)}Q^{(k)-1}\beta^{(k)} = z^{*(k)}\beta^{*(k)}$,
then $y = z^{*(k)}\beta^{*(k)} + e$. Now

$$\lambda(y) = \frac{\sup\{f(y; \beta^{*(k)}, \sigma^2) \mid (\beta^{*(k)}, \sigma^2) \in \Omega\}}
                    {\sup\{f(y; \beta^{*(k)}, \sigma^2) \mid (\beta^{*(k)}, \sigma^2) \in \omega\}},$$

where $f = (2\pi\sigma^2)^{-N/2}\exp\{-\frac{1}{2\sigma^2}e^Te\}$ with
$e = y - z^{*(k)}\beta^{*(k)}$. Clearly the extremizations in both cases can
be obtained by first minimizing $e^Te$ with respect to the $\beta^*_i$,
substituting these back in, and maximizing the resulting expressions with
respect to $\sigma^2$. But minimizing $e^Te$ is precisely the LS problem
encountered before. Using (as before) the orthogonal transform, $y' = Py$
and $e' = Pe$, where the first k rows of P are the transposed vectors
$z^*_1, \ldots, z^*_k$,

$$e^Te = e'^Te' = (y'_1 - \beta^*_1)^2 + \ldots + (y'_k - \beta^*_k)^2 + \sum_{\mu=k+1}^{N} y'^2_\mu.$$

Obviously $e^Te$ over $\Omega$ is minimized when $\beta^*_i = y'_i$,
$i = 1, \ldots, k$, with the value of $e^Te$ reducing to
$\sum_{\mu=k+1}^{N} y'^2_\mu = e^{(k)2}$; while $e^Te$ is minimized on
$\omega$ when $\beta^*_i = y'_i$, $i = 1, \ldots, k-r$ (recall that
$\beta^*_{k-r+1} = \ldots = \beta^*_k = 0$ in this case), with the value of
$e^Te$ reducing to

$$\sum_{\mu=k-r+1}^{N} y'^2_\mu = (\hat{y}^{(k)} - \hat{y}^{(k-r)})^2 + e^{(k)2}$$

in this case.

Substituting these extreme values back in and maximizing the numerator
and denominator with respect to $\sigma^2$, gives for the numerator

$$\hat\sigma^2_\Omega = \frac{e^{(k)2}}{N}$$

and for the denominator

$$\hat\sigma^2_\omega = \frac{(\hat{y}^{(k)} - \hat{y}^{(k-r)})^2 + e^{(k)2}}{N}.$$

Replacing these in the expression for $\lambda(y)$ we get

$$\lambda(y) = \Big[\frac{\hat\sigma^2_\omega}{\hat\sigma^2_\Omega}\Big]^{N/2}
  = \Big[1 + \frac{(\hat{y}^{(k)} - \hat{y}^{(k-r)})^2}{e^{(k)2}}\Big]^{N/2}.$$

Now

$$R = \{y \mid \lambda(y) \ge \lambda_\alpha\}
  = \Big\{y \ \Big|\ \Big[1 + \frac{(\hat{y}^{(k)} - \hat{y}^{(k-r)})^2}{e^{(k)2}}\Big]^{N/2} \ge \lambda_\alpha\Big\}
  = \Big\{y \ \Big|\ \frac{(\hat{y}^{(k)} - \hat{y}^{(k-r)})^2/r}{e^{(k)2}/(N-k)}
      \ge \frac{(\lambda_\alpha^{2/N} - 1)(N-k)}{r}\Big\}.$$

Finally, since $(\hat{y}^{(k)} - \hat{y}^{(k-r)})^2 = \sum_{i=k-r+1}^{k} y'^2_i$
and $e^{(k)2} = \sum_{\mu=k+1}^{N} y'^2_\mu$, since by Lemma 1 $y' = Py$ is a
vector of normal independent variables with common variance $\sigma^2$, and
since under the hypothesis $H_0$ $Ey'_i = 0$ $(i = k-r+1, \ldots, k)$, then
the ratio

$$\frac{(\hat{y}^{(k)} - \hat{y}^{(k-r)})^2/r}{e^{(k)2}/(N-k)}
  = \frac{\sum_{i=k-r+1}^{k} (y'_i/\sigma)^2/r}{\sum_{\mu=k+1}^{N} (y'_\mu/\sigma)^2/(N-k)}$$

is a ratio of averages of sums of squares of N(0,1) independent random
variables when $H_0$ is true. That is, the likelihood ratio is equivalent
to an F statistic when $H_0$ is true.


SELECTION  OF  SIGNIFICANT  ESTIMATION  VARIABLES 
IN  A LEAST  SQUARES  PROBLEM:  EMPIRICAL  COMPUTER  STUDIES 

The  object  of  these  studies  was  to  investigate  the  usefulness  of 
the  step-up  procedure  or  modifications  of  it,  in  choosing  a subset  of 
a large  number  of  estimation  variables  which  is  good  in  a least  squares 
sense.  In  the  first  phase  of  these  studies  we  wished  to  compare  the 
step-up  procedure  with  the  procedure  of  finding  the  best  subset  at 
each  stage.  Because  of  the  large  number  of  matrix  inversions  required 
in  the  last  method  we  could  handle  only  a very  small  number  of  terms . 

The results of the first phase are summarized in the two examples
which follow. In the first run we note that the step-up procedure
gave two terms with $R^2 = 0.724$ whereas the best two terms give
$R^2 = 0.886$.

Phase  One  - Run  1 

In  this  run  the  dependent  variable  was 

p(X1,X2,X3)  = 3/(1+xJ+2X3)  . 


The polynomial model was a balanced polynomial linear in $X_1$, $X_2$, and
$X_3$, i.e., $a_1X_1 + a_2X_2 + a_3X_3 + a_4X_1X_2 + a_5X_1X_3 + a_6X_2X_3 + a_7X_1X_2X_3$.
The 125 data points were in a rectangular design with $X_1$ = .25(.25)1.25,
$X_2$ = .25(.25)1.25, and $X_3$ = .25(.25)1.25. As will be noted in this run,
the function F is actually independent of $X_2$ and hence the estimation
variables $Z_2$, $Z_4$, $Z_6$, $Z_7$ should not enter the regression equation.

      Step-up Procedure                   Optimum Set
Estimation                         Estimation
Variables            R^2           Variables            R^2

5                  .569815         5                  .569815
5,3                .724129         3,1                .885715
5,3,1              .957606         3,1,5              .957606
5,3,1,2            .957615         3,2,1,5            .957615
5,3,1,2,4          .957631         3,2,1,5,4          .957631
5,3,1,2,4,6        .957632         3,2,6,1,5,4        .957632
5,3,1,2,4,6,7      .957634         3,2,6,1,5,4,7      .957634

Note that the step-up procedure did not select the optimum subset of
two variables.
Phase One - Run 2

In this run the dependent variable and the polynomial model were the
same as in Run 1. The 500 data points were in a rectangular design with
$X_1$ = .25(.25)2.50, $X_2$ = .25(.25)2.50, and $X_3$ = .25(.25)1.25.

      Step-up Procedure                   Optimal Set
Estimation                         Estimation
Variables            R^2           Variables            R^2

1                  .702925         1                  .702925
1,3                .884762         3,1                .884762
1,3,5              .963786         3,1,5              .963786
1,3,5,2            .963789         3,2,1,5            .963789
1,3,5,2,6          .963791         3,2,6,1,5          .963791
1,3,5,2,6,4        .963791         3,2,6,1,5,4        .963791
1,3,5,2,6,4,7      .963791         3,2,6,1,5,4,7      .963791

In this case, the step-up procedure gave the optimal subset in each
case.

Conclusions from Phase One


These  runs  indicated  that  some  modification  (e.g.,  a throw-out  rule) 
might  be  helpful  in  obtaining  a regression  equation  which  would  be  close 
to  the  optimal.  To  investigate  every  possible  regression  equation  even 
from  a small  set  of  terms  is  so  time  consuming  that  we  did  not  use  any 
example  with  a large  number  of  terms  in  this  phase. 
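The comparison made in Phase One can be sketched in a few lines. This is an
illustration rather than the program used in the study: the dependent
variable below, 3/(1 + X1 + 2 X3), is an assumed stand-in that is independent
of X2, the helper names are hypothetical, and the R^2 values it produces are
not claimed to reproduce the tabulated ones. The design and the seven
estimation variables follow the description of Run 1.

```python
from itertools import combinations, product

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(n):
            if i != c:
                f = m[i][c] / m[c][c]
                for j in range(c, n + 1):
                    m[i][j] -= f * m[c][j]
    return [m[i][n] / m[i][i] for i in range(n)]

def center(col):
    mean = sum(col) / len(col)
    return [v - mean for v in col]

grid = [0.25 * i for i in range(1, 6)]
points = list(product(grid, grid, grid))            # 125-point rectangular design
# Assumed stand-in for the dependent variable (independent of X2).
y = center([3.0 / (1.0 + x1 + 2.0 * x3) for x1, x2, x3 in points])
funcs = {1: lambda p: p[0], 2: lambda p: p[1], 3: lambda p: p[2],
         4: lambda p: p[0] * p[1], 5: lambda p: p[0] * p[2],
         6: lambda p: p[1] * p[2], 7: lambda p: p[0] * p[1] * p[2]}
z = {j: center([f(p) for p in points]) for j, f in funcs.items()}

def r2_of(subset):
    """R^2 of the least squares fit on the mean-adjusted variables in subset."""
    cols = [z[j] for j in subset]
    h = [[dot(u, v) for v in cols] for u in cols]
    g = [dot(u, y) for u in cols]
    return dot(solve(h, g), g) / dot(y, y)

def step_up():
    """Bring in, one at a time, the variable giving the largest R^2."""
    chosen, path = [], []
    while len(chosen) < len(z):
        best = max((j for j in z if j not in chosen),
                   key=lambda j: r2_of(chosen + [j]))
        chosen.append(best)
        path.append((tuple(chosen), r2_of(chosen)))
    return path

def best_subsets():
    """Exhaustive search: the best subset of each size."""
    out = []
    for size in range(1, len(z) + 1):
        best = max(combinations(sorted(z), size), key=r2_of)
        out.append((best, r2_of(best)))
    return out

if __name__ == "__main__":
    for (s1, r1), (s2, r2) in zip(step_up(), best_subsets()):
        print(s1, round(r1, 6), "|", s2, round(r2, 6))
```

The exhaustive search solves one set of normal equations per subset, 127 in
all for seven variables, which is why it becomes impractical as the number of
terms grows.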

Phase  Two 

In this phase we used examples with a large number of terms. We
used various throw-out criteria to investigate the relative merits of
each. We did not find the optimal subsets.
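A step-up procedure with a throw-out rule can be sketched as follows. The
exact rule used in the study is not spelled out in this section, so the
details here are assumptions: a sweep throws out the term with the smallest
F out when that value is below F0, otherwise it brings in the term with the
largest F in; the search stops when the best F in is itself below F0 or
after `max_sweeps` sweeps. For speed the sketch reuses a small seven-term
design and an assumed dependent variable, 3/(1 + X1 + 2 X3), rather than a
47-term model.

```python
from itertools import product

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(n):
            if i != c:
                f = m[i][c] / m[c][c]
                for j in range(c, n + 1):
                    m[i][j] -= f * m[c][j]
    return [m[i][n] / m[i][i] for i in range(n)]

def center(col):
    mean = sum(col) / len(col)
    return [v - mean for v in col]

grid = [0.25 * i for i in range(1, 6)]
points = list(product(grid, grid, grid))
y = center([3.0 / (1.0 + x1 + 2.0 * x3) for x1, x2, x3 in points])  # assumed example
funcs = {1: lambda p: p[0], 2: lambda p: p[1], 3: lambda p: p[2],
         4: lambda p: p[0] * p[1], 5: lambda p: p[0] * p[2],
         6: lambda p: p[1] * p[2], 7: lambda p: p[0] * p[1] * p[2]}
z = {j: center([f(p) for p in points]) for j, f in funcs.items()}

def r2_of(model):
    cols = [z[j] for j in model]
    h = [[dot(u, v) for v in cols] for u in cols]
    g = [dot(u, y) for u in cols]
    return dot(solve(h, g), g) / dot(y, y)

def stepwise(f0, max_sweeps=40):
    """Return the final model and a log of (sweep, m, term, 'in'/'out', F)."""
    n, model, log = len(y), [], []
    for sweep in range(1, max_sweeps + 1):
        r2 = r2_of(model)
        dof = n - len(model) - 1           # residual degrees of freedom
        # F out: loss in R^2 if the term is removed, over the residual mean square
        f_out = {j: (r2 - r2_of([i for i in model if i != j])) * dof / (1.0 - r2)
                 for j in model}
        if f_out and min(f_out.values()) < f0:
            worst = min(f_out, key=f_out.get)
            model.remove(worst)
            log.append((sweep, len(model), worst, "out", f_out[worst]))
            continue
        rest = [j for j in z if j not in model]
        if not rest:
            break
        best = max(rest, key=lambda j: r2_of(model + [j]))
        new_r2 = r2_of(model + [best])
        f_in = (new_r2 - r2) * (dof - 1) / (1.0 - new_r2)
        if f_in < f0:                      # assumed stop rule: it would be thrown out at once
            break
        model.append(best)
        log.append((sweep, len(model), best, "in", f_in))
    return model, log

if __name__ == "__main__":
    for entry in stepwise(8.0)[1]:
        print(entry)
```

Setting F0 below zero disables the throw-out rule, so the procedure reduces
to the plain step-up procedure.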

Summary  of  Phase  Two 

In  the  first  12  runs  in  this  phase  we  used  a balanced  polynomial 
model  to  approximate  the  dependent  variables 


F fX  .X  .X  1 = ('xt-X^j.y2'*  lv  +v  Hv  I 
iv  2'  3'  v i “-2  A3'  I !+A2^3 1 


-1/2 


F2(X1,X2,X3)  = exp(-X^X2X3) 
F3(X1,X2,X3)  = /(X^+X2+X3) . 


The  results  of  these  runs  are  tabulated  below. 


/2  2 2 

In  the  case  of  the  ^7-term  polynomial  fits  very  well 

with  R2  = 0.999972.  In  fact  the  4 terms  X^Xg,  XgX^  X2,  X2  give  a fit  with 
2 

R = O.962.  The  first  7 terms  obtained  by  the  stepwise  procedure  are  X.^, 


x2x3'  Xl’  X2’  X2X3'  Xl'  X3'  and  have  r2  = °*"2,  With  a throw“out 

2 2 2^ 

criterion  > 1.44,  however,  we  find  that  X^X^,  X^,  X^,  XpC^j  Xy 
X2  fit  with  R2  = 0.996- 

Now for the case $F_2 = \exp(-X_1X_2X_3)$ we found that the 47-term
polynomial fit with $R^2 = 0.996$. The first seven terms obtained by the
step-up procedure were $X_1$, $X_2X_3$, $X_1^2$, $X_2^2X_3^2$,
$X_1^3X_2^2X_3^2$, $X_1X_2X_3$, and $X_1^2X_2X_3$ with $R^2 = 0.949$. With a
throw-out criterion > 0.8 we find that the seven terms $X_2X_3$, $X_1^2$,
$X_2^2X_3^2$, $X_1^3X_2^2X_3^2$, $X_1X_2X_3$, $X_1^2X_2X_3$ and
$X_1X_2^2X_3^2$ are a better seven and fit with $R^2 = 0.965$.

With a throw-out criterion > 4.9 we find that the seven terms
$X_1$, $X_2X_3$, $X_2^2X_3^2$, $X_1X_2X_3$, $X_1^2X_2X_3$, $X_1X_2^2X_3^2$,
$X_1^3X_2^3X_3^2$ fit with $R^2 = 0.962$ and that the seven terms $X_1$,
$X_1X_2X_3$, $X_1^2X_2X_3$, $X_1X_2^2X_3^2$, $X_1^3X_2^3X_3^2$,
$X_1^2X_2^2X_3$, and $X_1^2X_2$ fit with $R^2 = 0.978$. We also find in fact
that the first five terms in the last fit have $R^2 = 0.962$. Thus the five
terms $X_1$, $X_1X_2X_3$, $X_1^2X_2X_3$, $X_1X_2^2X_3^2$, and
$X_1^3X_2^3X_3^2$ fit better than the seven terms given by the step-up
procedure with no throw-out criterion.


In  case  F^  = ^=j 


4 3 2 
X^+X* 


'Xl+X2-  IX3 


where the denominator has zeros in the region of fitting we find that the
fit is not quite as good. The 47-term polynomial gives $R^2 = 0.938$.
Again, however, we find that a seven-term polynomial will do almost as
well. The straight step-up procedure gives the seven terms $X_1^3X_3$,
$X_1^3$, $X_3^2$, $X_2^3$, $X_1^2X_2X_3$, $X_1^3X_2^2X_3$, and $X_2X_3$ which
fit with $R^2 = 0.894$. With a throw-out criterion > 6.3 we find that the
seven terms $X_1^3$, $X_2^3$, $X_1^2X_3^2$, $X_1X_2^2X_3^2$, $X_2X_3^2$,
$X_1X_2X_3^2$, and $X_2^3X_3^2$ fit with $R^2 = 0.902$.

This example also gave rise to the situation where, while $X_1^3X_3$ is
the best single term, it is not one of the best two terms. The best two
terms involving $X_1^3X_3$ are $X_1^3X_3$ and $X_1^3$ which fit with
$R^2 = 0.733$. However, the two terms $X_1^3$ and $X_3^2$ fit with
$R^2 = 0.775$. Another situation which occurred on this example was that
with a throw-out criterion of > 4.9 we would arrive at a five-term
polynomial with $R^2 = 0.876$ whereas the step-up procedure with no
throw-out criterion leads to a five-term polynomial with $R^2 = 0.884$.
Hence, having a throw-out criterion is not always better.

As an example of a non-balanced design with an arbitrary linear
model we used a correlation matrix given in Anderson and Fruchter,
"Prediction Selection Method," Psychometrika, Vol. 25, No. 1. The
results are tabulated in Run 17. Here we found that the throw-out
criterion was not used, and so the variables were selected by the
step-up procedure without this option. The overall fit using 14 variables
gave $R^2 = 0.270$ and an F(14,295) = 7.8 which is significant at 0.005.
However, an F test of the hypothesis that the last 9 variables have
zero coefficients is not significant at even the 50% level. The $R^2$
for the first five terms of the step-up procedure is $R^2 = 0.259$.
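Both F values quoted for Run 17 can be recovered from the reported $R^2$
figures alone. The sketch below assumes the usual expression of the F
statistic in terms of $R^2$, with 295 error degrees of freedom as in the
quoted F(14,295); the function names are illustrative.

```python
def overall_f(r2, m, dof):
    """F for the hypothesis that all m coefficients are zero."""
    return (r2 / m) / ((1.0 - r2) / dof)

def partial_f(r2_full, r2_reduced, r, dof):
    """F for the hypothesis that the last r coefficients are zero."""
    return ((r2_full - r2_reduced) / r) / ((1.0 - r2_full) / dof)

if __name__ == "__main__":
    print(round(overall_f(0.270, 14, 295), 2))             # 7.79
    print(round(partial_f(0.270, 0.259, 9, 295), 2))       # 0.49
```

The second value is well below one, which agrees with the statement that the
last 9 variables are not significant at even the 50% level.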

Phase  Two  - Run  1 

In  this  run,  the  dependent  variable  was  FCX^X^X^)  = 

(X^+x|+X2)  lxi+x2-  gc3|-V2. 


To fit this expression we used the polynomial model

$$\sum_{i_1=0}^{3}\ \sum_{i_2=0}^{3}\ \sum_{i_3=0}^{2} a_{i_1i_2i_3}\,X_1^{i_1}X_2^{i_2}X_3^{i_3}.$$

All the terms, including the dependent variable, are first adjusted for
their means. Thus we wish to find subsets of the 47 terms in this
polynomial which give the best approximation to the dependent variable.
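The term numbers used in the run tabulations appear to follow the nesting of
the three sums. The mapping below is an inference from the legible table
entries (for example term 36 as X1^3, term 2 as X3^2, term 9 as X2^3), not
something stated in the text, and the function name is hypothetical.

```python
def term_exponents(number):
    """Exponents (i1, i2, i3) of a term, assuming number = 12*i1 + 3*i2 + i3."""
    i1, rest = divmod(number, 12)
    i2, i3 = divmod(rest, 3)
    return i1, i2, i3

# The 47 non-constant terms, cubic in X1 and X2 and quadratic in X3.
terms = {n: term_exponents(n) for n in range(1, 48)}

if __name__ == "__main__":
    for n in (37, 36, 2, 9, 28):
        print(n, terms[n])
```

Under this assumed numbering term 37 is X1^3 X3 and term 47 is X1^3 X2^3 X3^2.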


The values of $F(X_1,X_2,X_3)$ and of the terms $X_1^{i_1}X_2^{i_2}X_3^{i_3}$
were all calculated at 500 points in a balanced design. In this run we used
the points $X_1$ = .25(.25)2.50, $X_2$ = .25(.25)2.50, and
$X_3$ = .25(.25)1.25.

The throw-out criterion for this run was $F_0 = 1.5$. A tabulation
of the terms as they were brought in follows. (Reduced $R^2$ is
$1 - (1-R^2)\frac{N-1}{N-m}$, where $R^2$ is the square of the multiple
correlation coefficient and N = 500, the number of observations, and m is
the number of terms in the model.)
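The reduced $R^2$ can be checked against the tabulated runs. The sketch
below uses the factor (N-1)/(N-m), a reading that reproduces the five-digit
entries of Run 4; the function name is illustrative.

```python
def reduced_r2(r2, n, m):
    """Reduced R^2 = 1 - (1 - R^2)(N - 1)/(N - m)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - m)

if __name__ == "__main__":
    # Entries from Run 4 (N = 500): m, R^2, tabulated reduced R^2
    for m, r2, tabulated in ((30, 0.92828, 0.92385),
                             (34, 0.93196, 0.92714),
                             (47, 0.938142, 0.931860)):
        print(m, round(reduced_r2(r2, 500, m), 5), tabulated)
```

Because the correction grows with m, two models with nearly equal $R^2$ can
be ranked differently by the reduced value.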


(m = Terms in Model)

Sweep     m  Term No  Term                  F in    F out      R^2  Reduced R^2

    1     1       37  X1^3 X3               1058              .680     .680
    2     2       36  X1^3                 98.96              .733     .733
    3     3        2  X3^2                 99.34              .778     .777
    4     4        9  X2^3                103.15              .816     .815
    5     5       28  X1^2 X2 X3          292.59              .884     .884
    6     6       43  X1^3 X2^2 X3         25.37              .890     .889
    7     7        4  X2 X3                19.01              .894     .893
    8     8       14  X1 X3^2              26.76              .900     .898
    9     9       16  X1 X2 X3             15.58              .903     .901
   10     8        2  X3^2                           0.21     .903     .901
   11     9       17  X1 X2 X3^2            8.34              .904     .903
   12    10       19  X1 X2^2 X3           10.78              .906     .905
   13     9       43  X1^3 X2^2 X3                   0.11     .906     .905
   14     8       37  X1^3 X3                        1.47     .906     .905
   15     9        1  X3                   12.94              .909     .907
   16    10       10  X2^3 X3              11.01              .911     .909
   17    11       45  X1^3 X2^3            15.92              .913     .912
   18    12        5  X2 X3^2              14.62              .916     .914
   19    13       38  X1^3 X3^2             6.96              .917     .915
   20    14        2  X3^2                  6.66              .918     .916
   21    13        1  X3                             0.15     .918     .916
   22    14       40  X1^3 X2 X3            6.69              .919     .917
   23    13       28  X1^2 X2 X3                     0.04     .919     .917
   24    14       43  X1^3 X2^2 X3          3.23              .920     .918
   25    15       44  X1^3 X2^2 X3^2        2.84              .920     .918
   26    16       12  X1                    3.93              .921     .918
   27    17       11  X2^3 X3^2             3.27              .921     .919
   28    16        4  X2 X3                          0.02     .921     .919
   29    17       21  X1 X2^3               1.62              .922     .919


Run 2

In this run the dependent variable, the polynomial model and the
data points were all the same as in Run 1. The throw-out criterion was
$F_0 = 0.9$. This run should tend to throw out terms less often than Run 1.
This should lead to fewer sweeps to reach k terms but perhaps the fit
for these terms will not be as good as in Run 1. The tabulation through
Sweep 13 is identical with Run 1.

(m = Terms in Model)

Sweep     m  Term No  Term                  F in    F out      R^2  Reduced R^2

   13     9       43  X1^3 X2^2 X3                   0.11     .906     .905
   14    10        1  X3                   11.46              .909     .907
   15     9       37  X1^3 X3                        0.05     .909     .907
   16    10       10  X2^3 X3              11.01              .911     .909

          Sweeps 16 through 29 are the same as Run 1

   29    17       21  X1 X2^3               1.62              .922     .919
   30    16       45  X1^3 X2^3                      0.03     .922     .919
   31    17       26  X1^2 X3^2             1.34              .922     .919
   32    18       24  X1^2                  3.09              .922     .920
   33    19       13  X1 X3                 1.20              .923     .920
   34    20       18  X1 X2^2               2.86              .923     .920
   35    19       21  X1 X2^3                        0.00     .923     .920
   36    20        1  X3                    1.01              .923     .920
   37    19       11  X2^3 X3^2                      0.59     .923     .920
   38    20       37  X1^3 X3               1.42              .923     .920
   39    21       47  X1^3 X2^3 X3^2        1.53              .924     .920
   40    22       45  X1^3 X2^3             2.13              .924     .921
   41    23        7  X2^2 X3               1.94              .924     .921
   42    22       16  X1 X2 X3                       0.31     .924     .921
   43    23       23  X1 X2^3 X3^2          1.10              .924     .921
   44    24       11  X2^3 X3^2             1.22              .925     .921
   45    25       25  X1^2 X3               0.92              .925     .921

Run 3

In this run the dependent variable, the polynomial model and the data
points were all the same as in Run 1. The throw-out criterion for Run 3
was $F_0 = 8.0$. This run should tend to throw out terms more often than
Run 1 or Run 2. This should lead to more sweeps to reach k terms but
hopefully the fit for these k terms will be better than in Run 1 or Run 2.
(Compare, however, Run 3, Sweep 7, with Run 1, Sweep 5 and also Run 3,
Sweep 18 with Run 1, Sweep 12.) Note that in Run 3 we see that the best
term No. 37 is not one of the best two terms.

(m = Terms in Model)

Sweep     m  Term No  Term                  F in    F out      R^2  Reduced R^2

    1     1       37  X1^3 X3            1058.24              .680     .680
    2     2       36  X1^3                 98.96              .733     .733
    3     3        2  X3^2                 99.34              .778     .777
    4     2       37  X1^3 X3                        4.88     .775     .775
    5     3        9  X2^3                102.13              .814     .813
    6     4       15  X1 X2               219.83              .871     .870
    7     5       20  X1 X2^2 X3^2         18.48              .875     .875
    8     6       26  X1^2 X3^2            40.36              .885     .884
    9     7       17  X1 X2 X3^2           22.30              .890     .889
   10     6       20  X1 X2^2 X3^2                   5.12     .889     .888
   11     7        5  X2 X3^2              40.48              .897     .896
   12     8       20  X1 X2^2 X3^2         20.02              .901     .900
   13     7       15  X1 X2                          6.22     .900     .899
   14     8       11  X2^3 X3^2            12.04              .903     .901
   15     7        2  X3^2                           2.93     .902     .901
   16     8       38  X1^3 X3^2            26.81              .907     .906
   17     9       18  X1 X2^2              22.83              .911     .910
   18    10       41  X1^3 X2 X3^2          8.20              .912     .911

Run 4

In this run the dependent variable, the polynomial model and the
data points were all the same as in Run 1. The throw-out criterion was
$F_0 = 10^{-3}$. This run should not throw out variables very often, at least
not until they are very insignificant. A partial tabulation of this
run follows.

(m = Terms in Model)

Sweep     m  Term No  Term                  F in    F out      R^2  Reduced R^2

    1     1       37  X1^3 X3            1058.00              .680     .680
    2     2       36  X1^3                 98.96              .733     .733
    3     3        2  X3^2                 99.34              .778     .777
    4     4        9  X2^3                103.15              .816     .815
    5     5       28  X1^2 X2 X3          292.59              .884     .884
    6     6       43  X1^3 X2^2 X3         25.37              .890     .889
    7     7        4  X2 X3                19.01              .894     .893
    8     8       14  X1 X3^2              26.76              .900     .898
    9     9       16  X1 X2 X3             15.58              .903     .901
   10    10       41  X1^3 X2 X3^2         10.40              .905     .903
   11    11       45  X1^3 X2^3            11.46              .907     .905
   12    12       21  X1 X2^3              23.71              .911     .909
   13    13       46  X1^3 X2^3 X3          6.60              .913     .910
   14    12        9  X2^3                           0.00     .913     .911
   15    13        1  X3                    3.93              .913     .911
   16    14       10  X2^3 X3               3.10              .914     .911
   20    16        2  X3^2                           0.00     .916     .914
   25    21       25  X1^2 X3               3.67              .920     .917
   30    22       24  X1^2                           0.00     .923     .920
   35    27       39  X1^3 X2               5.35              .927     .923
   40    30        9  X2^3                           0.00   .92828   .92385
   45    31       20  X1 X2^2 X3^2          0.53            .92866   .92410
   50    34       34  X1^2 X2^3 X3         15.23            .93196   .92714
   55    37        6  X2^2                           0.00   .93459   .92950
   60    40       42  X1^3 X2^2             0.44            .93682   .93146
   65    45       18  X1 X2^2               0.55            .93730   .93124
   66    46        9  X2^3                  1.04            .93745   .93125
   67    47       30  X1^2 X2^2             5.08           .938142  .931860

Run 5

In this run, the dependent variable was $F(X_1,X_2,X_3) = \exp(-X_1X_2X_3)$.
We used the same balanced polynomial model as in the first four runs, cubic
in $X_1$ and $X_2$, quadratic in $X_3$. The 500 data points were in the same
balanced design, $X_1$ = .25(.25)2.50, $X_2$ = .25(.25)2.50,
$X_3$ = .25(.25)1.25. The polynomial model in this case should fit better
than in the first four runs.

The throw-out criterion in the first runs in this series was $F_0 = 1.5$.

(m = Terms in Model)

Sweep     m  Term No  Term                  F in    F out      R^2  Reduced R^2

    1     1       12  X1                  836.43              .627     .627
    2     2        4  X2 X3               529.77              .819     .819
    3     3       24  X1^2                212.65              .874     .873
    4     4        8  X2^2 X3^2           308.40              .922     .922
    5     5       44  X1^3 X2^2 X3^2       61.04              .931     .930
    6     6       16  X1 X2 X3            103.35              .943     .942
    7     7       28  X1^2 X2 X3           62.76              .949     .949
    8     8       20  X1 X2^2 X3^2        231.18              .965     .965
    9     7       12  X1                             0.79     .965     .965
   10     8       23  X1 X2^3 X3^2         27.03              .967     .967
   11     9       22  X1 X2^3 X3          138.53              .974     .974
   12    10       21  X1 X2^3             411.57              .986     .986
   13    11       27  X1^2 X2               8.79              .986     .986
   14    12       25  X1^2 X3              97.48              .989     .988
   15    13       30  X1^2 X2^2            67.45              .990     .990
   16    14       38  X1^3 X3^2            73.37              .991     .991
   17    13       24  X1^2                           0.03     .991     .991
   18    14       33  X1^2 X2^3            32.08              .992     .992
   19    15       39  X1^3 X2              23.02              .992     .992
   20    16       32  X1^2 X2^2 X3^2       75.84              .993     .993
   25    17       42  X1^3 X2^2            33.90            .99431   .99412
   26    18       35  X1^2 X2^3 X3^2       14.09            .99449   .99428
   27    17       23  X1 X2^3 X3^2                   1.20   .99445   .99427
   28    18        8  X2^2 X3^2            11.81            .99459   .99440
   29    19       14  X1 X3^2              18.00            .99479   .99459
   30    20       45  X1^3 X2^3             5.57            .99485   .99464
   35    25       21  X1 X2^3              11.63            .99594   .99573
   40    26       21  X1 X2^3                  1.12*        .99604   .99583
   44    26       21  X1 X2^3                  0.93*        .99606   .99585

* Column (F in or F out) not legible.


Run 6

This run used the same dependent variable, polynomial model and data
points as in Run 5. The throw-out criterion was $F_0 = 0.9$. This will tend
to throw out terms less often than in Run 5. In fact, however, the runs
are identical through Sweep 26.


Sweep     m  Term No.  Term                   F in   F out        R2  Reduced R2
   25    17        42  X1^3 X2^2             33.90            .99431      .99412
   26    18        35  X1^2 X2^3 X3^2        14.09            .99447      .99428
   27    19        11  X2^3 X3^2             13.66            .99462      .99492
   28    18        20  X1 X2^2 X3^2                   0.00    .99462      .99443
   29    19        14  X1 X3^2               16.67            .99480      .99461
   30    20        45  X1^3 X2^3              6.19            .99487      .99467
   35    23        42  X1^3 X2^2                      0.44    .99574      .99555
   40    26        11  X2^3 X3^2                      0.48    .99609      .99588
   41    25        38  X1^3 X3^2                      0.59    .99608      .99589
   42    24        14  X1 X3^2                        0.61    .99608      .99589
   43    25        31  X1^2 X2^2 X3           0.79            .99608      .99589

Run  7 

In this run the dependent variable, the polynomial model and the
data points were all the same as in Run 5. The throw-out criterion
was $F_0 = 8.0$. The variables brought in were the same as in Run 5 through
Sweep 7.

Sweep     m  Term No.  Term                   F in   F out        R2  Reduced R2
    7     7        28  X1^2 X2 X3            62.76            .94925      .94863
    8     6        24  X1^2                           4.41    .94879      .94827
    9     5        44  X1^3 X2^2 X3^2                 4.83    .94829      .94787
   10     6        20  X1 X2^2 X3^2          69.96            .95471      .95425
   11     7        47  X1^3 X2^3 X3^2       100.42            .96239      .96193
   12     6         4  X2 X3                          1.81    .96225      .96187
   13     5         8  X2^2 X3^2                      1.07    .96217      .96186
   14     6        31  X1^2 X2^2 X3          63.95            .96651      .96617
   15     7        27  X1^2 X2              269.22            .97836      .97809
   16     8        23  X1 X2^3 X3^2         123.13            .98270      .98245
   17     9         4  X2 X3                 73.73            .98496      .98471
   18    10        25  X1^2 X3               72.64            .98690      .98666
   19    11        26  X1^2 X3^2             50.82            .98814      .98790
   20    12        36  X1^3                  52.25            .98929      .98905
   21    13        44  X1^3 X2^2 X3^2        46.61            .99023      .98999
   22    14        30  X1^2 X2^2             47.15            .99109      .99085
   23    15        33  X1^2 X2^3             78.01            .99233      .99211
   24    14        47  X1^3 X2^3 X3^2                 0.88    .99232      .99211
   25    13        12  X1                             3.00    .99227      .99208
   26    14         8  X2^2 X3^2             55.18            .99306      .99287
   27    15        22  X1 X2^3 X3             6.18            .99314      .99295

Run  8 


In this run the dependent variable, the polynomial model and the data
points were all the same as in Run 5. The throw-out criterion was $F_0 = 10^{-3}$.
The variables brought in were the same as in Run 5 through Sweep 8.


Sweep     m  Term No.  Term                   F in   F out        R2  Reduced R2
    8     8        20  X1 X2^2 X3^2         251.18            .96550      .96501
    9     9        23  X1 X2^3 X3^2          26.72            .96728      .96675
   10    10        22  X1 X2^3 X3           137.60            .97447      .97400
   11    11        21  X1 X2^3              416.70            .98623      .98595
   12    12        36  X1^3                  32.83            .98710      .98681
   13    13        40  X1^3 X2 X3            50.05            .98830      .98801
   14    14        15  X1 X2                  9.58            .98853      .98822
   15    15        13  X1 X3                148.72            .99122      .99097
   20    20         6  X2^2                  20.37            .99481      .99461
   25    25        33  X1^2 X2^3             10.32            .99537      .99514
   30    30         7  X2^2 X3                4.12            .99576      .99550
   35    35        30  X1^2 X2^2              3.22            .99619      .99591
   40    38        17  X1 X2 X3^2             5.45            .99630      .99601
   45    43        38  X1^3 X3^2              2.68            .99638      .99605
   50    44         8  X2^2 X3^2              0.87            .99640      .99606
   54    46        32  X1^2 X2^2 X3^2         0.18           .996399     .996042
   55    47        46  X1^3 X2^3 X3           2.21           .996416     .996052
   56    46        12  X1                             0.00   .996416     .996061
   57    47        12  X1                     0.00           .996416     .996052


Run  9 


In this run the dependent variable was F(X1, X2, X3) = yf


2 2 2 
VW 


The 47-term balanced polynomial, cubic in $X_1$ and $X_2$ and quadratic in $X_3$,
was used as the model to fit the dependent variable over the 500 data points
$X_1 = .25(.25)2.50$, $X_2 = .25(.25)2.50$, and $X_3 = .25(.25)1.25$.

As expected in this case, the fit is very good. Because of the symmetry
involved, the terms in $X_1$ and $X_2$ should be the same, at least in the
complete model. The lack of symmetry in the way these terms were brought in is
interesting.

The throw-out criterion for this run was $F_0 = 1.5$.


Sweep     m  Term No.  Term                   F in   F out        R2  Reduced R2
    1     1        15  X1 X2               1337.01            .72861      .72861
    2     2         4  X2 X3                 98.19            .77338      .77293
    3     3        24  X1^2                 309.02            .86037      .85981
    4     4         6  X2^2                1324.25            .96201      .96178
    5     5         7  X2^2 X3              302.54            .97644      .97625
    6     6        12  X1                   553.58            .98890      .98879
    7     7         2  X3^2                 202.14            .99213      .99204
    8     8         3  X2                   427.10            .99579      .99573
    9     7         4  X2 X3                          1.44    .99578      .99573
   10     8        14  X1 X3^2              469.05            .99784      .99781
   11     9        19  X1 X2^2 X3           169.84            .99840      .99837
   12    10         5  X2 X3^2              177.58            .99882
   13    11         9  X2^3                 144.20            .99909
   14    12        36  X1^3                 204.27            .99936
   15    13        17  X1 X2 X3^2           171.96            .99953
   16    14        21  X1 X2^3               87.04            .99960
   17    15        39  X1^3 X2               59.18            .99964
   18    16        30  X1^2 X2^2            125.68            .99972
   19    17         1  X3                    55.81            .99975
   20    18         4  X2 X3                114.22            .99980
   25    23        16  X1 X2 X3             107.82            .99994
   30    26        38  X1^3 X3^2             40.36            .99996
   35    27        20  X1 X2^2 X3^2                   0.09    .99996
   40    30        21  X1 X2^3                        0.00    .99996
   45    35        45  X1^3 X2^3              9.?3           .999969
   46    34        39  X1^3 X2                        0.26   .999969
   47    35        21  X1 X2^3                3.23           .999969
   48    36        41  X1^3 X2 X3^2           0.40           .999969

Run 10

In this run the dependent variable, the polynomial model and the
data points were the same as in Run 9. The throw-out criterion for this
run was $F_0 = 0.9$. The tabulation of the results is identical with Run
9 through Sweep 8.


Sweep     m  Term No.  Term                   F in   F out        R2
    8     8         3  X2                   427.10            .99579
    9     9        14  X1 X3^2              470.79           .997854
   10    10        16  X1 X2 X3             174.96           .998420
   11    11         1  X3                   212.37           .998899
   12    12         9  X2^3                 156.79           .999167
   13    13        36  X1^3                 230.76           .999435
   14    14        13  X1 X3                279.64           .999642
   15    15        45  X1^3 X2^3             82.42           .999694
   16    16        35  X1^2 X2^3 X3^2        52.71           .999724
   17    17        37  X1^3 X3              107.84           .999774
   18    18        21  X1 X2^3               47.97           .999795
   19    19        39  X1^3 X2              259.26           .999867
   20    20        31  X1^2 X2^2 X3         202.22           .999906
   25    25        23  X1 X2^3 X3^2          76.83           .999950
   30    28        29  X1^2 X2 X3^2           3.73           .999957
   40    30        45  X1^3 X2^3              6.72           .999960
   50    32        29  X1^2 X2 X3^2           4.50           .999963
   60    36        41  X1^3 X2 X3^2          10.53           .999967
   65    37        32  X1^2 X2^2 X3^2         0.86           .999968

Run 11

In this run the dependent variable, the polynomial model and the data
points were the same as in Run 9. The throw-out criterion was $F_0 = 8.0$.

The variables were included in the same order as in Run 9 through Sweep 15.

Sweep     m  Term No.  Term                   F in   F out   R2 = Reduced R2
   15    13        17  X1 X2 X3^2           171.96           .999528
   16    12         7  X2^2 X3                        2.77   .999525
   17    13        39  X1^3 X2               49.15           .999569
   18    14         1  X3                    37.69           .999600
   19    13        19  X1 X2^2 X3                     2.18   .999598
   20    14        21  X1 X2^3               59.03           .999642
   21    15        30  X1^2 X2^2            142.48           .999723
   22    16        13  X1 X3                 52.63           .999750
   23    17         4  X2 X3                 58.95           .999778
   24    18        16  X1 X2 X3              60.90           .999803
   25    17        17  X1 X2 X3^2                     0.20   .999802
   26    18        16  [illegible]           61.82           .999825
   27    19        22  X1 X2^3 X3           177.78           .999872
   28    20        37  X1^3 X3              102.41           .999895
   29    21        40  X1^3 X2 X3           390.39           .999942
   30    22        46  X1^3 X2^3 X3          38.26           .9999[illegible]
   35    25        38  X1^3 X3^2             10.63           .999957
   36    24        14  X1 X3^2                        2.26   .999957
   37    25        47  X1^3 X2^3 X3^2         7.35           .999958

Run  12 

In this run the dependent variable, the polynomial model, and the
data points were the same as in Run 9. The throw-out criterion was $F_0 = 10^{-3}$.
The variables came in the same order as in Run 10 through Sweep 28.

No throw-outs were made.

Sweep     m  Term No.  Term                   F in   F out   R2 = Reduced R2
   25    25        23  X1 X2^3 X3^2          76.83           .999950
   30    30        42  X1^3 X2^2              6.11           .999958
   35    35        43  X1^3 X2^2 X3          15.94           .999962
   40    40        28  X1^2 X2 X3            21.80           .999967
   45    45        25  X1^2 X3               28.42           .999971
   46    46        34  X1^2 X2^3 X3          11.16           .999972
   47    47        47  X1^3 X2^3 X3^2         1.10           .999972

Run  13 

In this run the dependent variable was $F(X_1, X_2, X_3) = \exp(-X_1 X_2 X_3)$
as in Run 9. The polynomial model was the same 47-term balanced polynomial,
cubic in $X_1$ and $X_2$ and quadratic in $X_3$. There were 1000 data points in a
rectangular design $X_1 = .25(.25)2.50$, $X_2 = .25(.25)2.50$, $X_3 = .25(.25)2.50$.

On this run the throw-out criterion was $F_0 = 1.0$.


Sweep     m  Term No.  Term                   F in   F out        R2
    1     1        12  X1                  1277.16              .561
    2     2         4  X2 X3                619.57              .729
    3     3        24  X1^2                 582.22              .829
    4     4         8  X2^2 X3^2            472.27              .884
    5     5        28  X1^2 X2 X3           267.53             .9088
    6     6        16  X1 X2 X3              69.37             .9147
    7     7        40  X1^3 X2 X3           225.79             .9305
    8     8        36  X1^3                  27.10             .9324
    9     9        11  X2^3 X3^2             27.63             .9342
   10    10        10  X2^3 X3              151.95             .9430
   11    11         9  X2^3                 234.89             .9539
   12    12        13  X1 X3                 23.04             .9550
   13    13        15  X1 X2                141.45             .9606
   14    14        44  X1^3 X2^2 X3^2        84.24             .9637
   15    15        20  X1 X2^2 X3^2         257.55             .9713
   16    16        14  X1 X3^2              130.45             .9746
   17    15        12  X1                             0.01     .9746
   18    16        18  X1 X2^2              307.20           .980673
   19    17        21  X1 X2^3              113.99           .982683
   20    16         4  X2 X3                          0.53   .982674
   21    17        32  X1^2 X2^2 X3^2        20.32           .983025
   22    16        44  X1^3 X2^2 X3^2                 0.09   .983023
   23    17        12  X1                    31.96           .983658
   24    18         4  X2 X3                 53.90           .984415
   25    17         9  X2^3                           0.75   .984403
   30    22        19  X1 X2^2 X3             8.60           .986977
   35    25        44  X1^3 X2^2 X3^2         6.49           .988926
   36    26        22  X1 X2^3 X3             3.53           .988966
   37    27         1  X3                     2.05           .988989

Run 14

In this run the dependent variable, the polynomial model and the data
points were the same as in Run 13. The throw-out criterion was $F_0 = 10^{-3}$.
The tabulation is identical with Run 13 through Sweep 16.


Sweep     m  Term No.  Term                   F in   F out        R2
   16    16        14  X1 X3^2              130.45             .9746
   17    17        18  X1 X2^2              318.07            .98084
   18    18        32  X1^2 X2^2 X3^2       144.76            .98330
   19    19        21  X1 X2^3              155.92            .98560
   20    20  [illeg.]  [illegible]           16.89            .98584
   21    21        34  X1^2 X2^3 X3          53.87            .98658
   22    22        19  X1 X2^2 X3             4.65            .98664
   23    23        17  X1 X2 X3^2            25.46            .98698
   24    24        33  X1^2 X2^3             89.60            .98808
   25    25        23  X1 X2^3 X3^2          15.91            .98827
   30    30         2  X3^2                  10.28            .98926
   35    35        38  X1^3 X3^2              4.23            .98950
   40    36        22  X1 X2^3 X3             0.91           .989603
   45    37        40  X1^3 X2 X3                     0.00   .989686
   50    40        26  X1^2 X3^2              2.43           .989835
   55    39        21  X1 X2^3                        0.00   .989862
   60    42         9  X2^3            [illegible]           .990066
   65    45         9  X2^3                   0.05           .990040
   66    46         9  X2^3                   0.00           .990040

Run  15 


In this run, the data were taken from Bulletin 336, Agricultural
Experiment Station, Auburn University, Auburn, Alabama.

The throw-out criterion was $F_0 = 10^{-3}$ but was never used.


Sweep  Terms in Model  Term No.  Term         F in        R2  Reduced R2
    1               1         4  X4          86.98      .696        .696
    2               2         2  X2           3.14      .720        .712
    3               3         5  X4^2         0.64      .725        .710
    4               4         3  X3           0.24      .726        .704
    5               5         6  X1 X4        0.13      .728        .696
    6               6         1  X1           0.58      .732        .693

Run 16

This run used the same data as in Run 15, but the polynomial model
was taken to be a balanced polynomial linear in $X_1$, $X_2$, and $X_3$ and
quadratic in $X_4$. This gives 23 terms in addition to the constant term.

Sweep     m  Term No.  Term                   F in   F out        R2  Reduced R2
    1     1         1  X4                    86.98            .69596      .69596
    2     2         5  X3 X4^2                3.38            .72142      .71409
    3     3         3  X3                     0.41            .72453      .70964
    4     4         7  X2 X4                  1.13            .73314      .71091
    5     5        14  X1 X4^2                1.71            .74590      .71686
    6     6        12  X1                     5.75            .78361      .75179
    7     7         9  X2 X3                  0.55            .78724      .74856
    8     8        16  X1 X3 X4               2.55            .80340      .76040
    9     9         2  X4^2                   1.51            .81280      .76449
   10    10        23  X1 X2 X3 X4^2          3.?9            .83293      .78281
   11    11         4  X3 X4                  0.34            .83491      .77798
   12    12         6  X2                     0.98           .840665     .778069
   13    13        17  X1 X3 X4^2             0.74           .845059     .776197
   14    14        13  X1 X4                  2.02           .856626     .784939
   15    13        23  X1 X2 X3 X4^2                  0.00   .856624     .792901
   16    14         8  X2 X4^2                0.64           .860176     .790265
   17    15        15  X1 X3                  0.27           .861736     .784308
   18    16        21  X1 X2 X3               0.10           .862309     .776253
   19    17        10  X2 X3 X4               0.35           .864448     .770151
   20    16        12  X1                             0.00   .864446     .779725
   21    17        18  X1 X2                  0.50           .867472     .775279
   22    18        23  X1 X2 X3 X4^2          0.89           .872836     .774572
   23    17        21  X1 X2 X3                       0.00   .872835     .784373
   24    18        19  X1 X2 X4               0.61           .876437     .780957
   25    17         3  X3                             0.00   .876435     .790476
   26    18        11  X2 X3 X4^2             0.61           .879902     .787099
   27    19        12  X1                     0.12           .880637     .778325
   28    20        22  X1 X2 X3 X4            0.19           .881802     .769515
   29    21         3  X3                     0.20           .883701     .761282
   30    20        19  X1 X2 X4                       0.00   .883697     .773210
   31    21        20  X1 X2 X4^2             0.13           .884550     .760023
   32    22        21  X1 X2 X3               0.06           .884958     .750743
   33    23        19  X1 X2 X4               0.01           .885035     .736256
   34    22        18  X1 X2                          0.00   .885034     .750906
   35    23        18  X1 X2                  0.00           .885035     .736256

Run  17 

In this run the data were a correlation matrix taken from Anderson,
H. E., and Fruchter, B., "Predictor Selection Methods," Psychometrika,
Vol. 25, No. 1, March 1960. In this run the throw-out criterion of $F_0 = 10^{-3}$
was never used.


Sweep     m  Term No.     F in         R2   Reduced R2
    1     1         6    56.94    .156025      .156025
    2     2         4    21.38    .210965      .208403
    3     3         3    10.18    .236372      .231397
    4     4        13     4.90    .248451      .241083
    5     5        12     4.13    .258529      .248805
    6     6        10     1.38    .261881      .249741
    7     7         1     0.92    .264125      .249553
    8     8         8     0.71    .265861      .248844
    9     9         2     0.42    .266898      .247413
   10    10         5     0.37    .267803      .245837
   11    11         9     0.40    .268785      .244330
   12    12         7     0.29    .269503      .242538
   13    13        11     0.17    .269932      .240435
   14    14        14     0.02    .269970      .237908

Conclusions 

We feel that the step-up procedure is an effective tool in the problem
of finding a regression equation with a small number of estimation variables
from a model with a large number. Using the various throw-out criteria and
stopping rules, the problems of interest could be explored. The throw-out
criterion and stopping rule which best fit the problem could be selected and
then a regression equation determined. We feel that most future investigation
of this procedure should be problem-oriented. We need the data for a problem
to help develop an effective way of handling the data.


SELECTION  OF  SIGNIFICANT  ESTIMATION  VARIABLES 
IN  A LEAST  SQUARES  PROBLEM:  COMPUTER  PROGRAMS 


1. Comparison of variables selected by step-up procedure with
optimal set. This procedure was programmed in the ALGOL 58 compiler
language for the Burroughs 220 computer. Because of limitations on
the memory the procedure is restricted to 25 variables.

The purpose of the program is to determine whether or not the
step-up procedure actually selects the best k estimation variables.
This program was preliminary to a more elaborate program for the
Burroughs 5000.

First, the data are generated. The estimation variables $Z_1, \ldots, Z_{n-1}$
are terms of a balanced polynomial in independent variables
$X_1, \ldots, X_\pi$, i.e.,

$$Z_k = X_1^{\lambda_1} X_2^{\lambda_2} \cdots X_\pi^{\lambda_\pi}, \qquad \lambda_i = 0, \ldots, L_i, \quad i = 1, 2, \ldots, \pi,$$

where $(\lambda_1, \ldots, \lambda_\pi)$ takes on all possible values in the given range except
$(0, \ldots, 0)$. Certain terms of the balanced polynomial are to be used to
estimate a dependent variable, which is some function of the $X$'s. It is
convenient to label this variable $Z_n$. Corresponding to an index
$t_i = 1, 2, \ldots, T_i$, $i = 1, 2, \ldots, \pi$, the observed value of $X_i$ is $x_{i t_i}$. Thus,
corresponding to the set $\{(t_1, \ldots, t_\pi) \mid t_i = 1, 2, \ldots, T_i,\ i = 1, 2, \ldots, \pi\}$ is a
rectangular set of data points $\{(x_{1 t_1}, \ldots, x_{\pi t_\pi})\}$ from which are calculated
observed values, $(z_{\mu 1}, \ldots, z_{\mu, n-1}, z_{\mu n})$, of the vector consisting of the
estimation variables and the dependent variable.
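The data-generation step described above can be sketched in a few lines of Python. This is only an illustration, not the ALGOL program: the function names are hypothetical, and the ordering of the terms (last exponent varying fastest) is an assumption, although it appears consistent with the term numbers quoted in the run tabulations (for example, term 12 as $X_1$ and term 4 as $X_2 X_3$ in the 47-term model).

```python
from itertools import product

def balanced_terms(max_orders):
    """Exponent tuples of a balanced polynomial, constant term excluded.

    max_orders[i] is the highest power L_i of the i-th independent variable.
    The ordering (last exponent varies fastest) is an assumption.
    """
    ranges = [range(order + 1) for order in max_orders]
    return [e for e in product(*ranges) if any(e)]

def steps(start, step, stop):
    """Levels written as start(step)stop, e.g. .25(.25)2.50."""
    count = round((stop - start) / step) + 1
    return [start + i * step for i in range(count)]

def rectangular_design(levels):
    """All combinations of the listed levels, one list per variable."""
    return list(product(*levels))

def term_value(exponents, point):
    """Value of one polynomial term at one data point."""
    value = 1.0
    for x, e in zip(point, exponents):
        value *= x ** e
    return value

# Cubic in X1 and X2, quadratic in X3: 4 * 4 * 3 - 1 = 47 terms.
terms = balanced_terms([3, 3, 2])
design = rectangular_design([steps(0.25, 0.25, 2.50),
                             steps(0.25, 0.25, 2.50),
                             steps(0.25, 0.25, 1.25)])
print(len(terms), len(design))
```

With these inputs the sketch yields 47 terms and 500 data points, the sizes quoted for the balanced design in the runs above.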

Next, regression analyses are made using all possible combinations
of k estimation variables, where k = 2, ..., n-2. For each k, the combinations
of variables which give maximum and minimum sums of squares due to
regression (and hence maximum and minimum multiple correlation) are printed
along with the sums of squares.
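A minimal sketch of this exhaustive comparison is given below, assuming the full symmetric matrix of sums of squares and products is available with the dependent variable in the last row and column. The function names and the small example matrix are hypothetical; the normal equations are solved by plain Gaussian elimination with no safeguards against singular subsets.

```python
from itertools import combinations

def regression_ss(S, subset, dep):
    """Sum of squares due to regression of variable `dep` on `subset`."""
    k = len(subset)
    # Augmented normal equations for the chosen subset.
    A = [[S[i][j] for j in subset] + [S[i][dep]] for i in subset]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    coeffs = [A[i][k] / A[i][i] for i in range(k)]
    return sum(b * S[j][dep] for b, j in zip(coeffs, subset))

def best_and_worst(S, k):
    """Subsets of size k with maximum and minimum regression sum of squares."""
    dep = len(S) - 1
    scored = [(regression_ss(S, c, dep), c)
              for c in combinations(range(dep), k)]
    return max(scored), min(scored)

# Hypothetical example: three orthogonal estimation variables.
S_example = [[1.0, 0.0, 0.0, 3.0],
             [0.0, 1.0, 0.0, 2.0],
             [0.0, 0.0, 1.0, 1.0],
             [3.0, 2.0, 1.0, 20.0]]
print(best_and_worst(S_example, 2))
```

The number of subsets grows combinatorially with the number of variables, which is in line with the prohibitive running time the text reports for 18 variables.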


Finally, the step-up procedure is used. At the $k'$-th step, the
variable is selected from those not already included which maximizes
$(S_{kn}^{(k')})^2 / S_{kk}^{(k')}$. The procedure then uses that variable, $Z_k$, as the pivot
variable. It makes the following calculations:

$$S_{kj}^{(k'+1)} = \frac{S_{kj}^{(k')}}{S_{kk}^{(k')}}, \qquad j = 1, 2, \ldots, n,$$

$$S_{ij}^{(k'+1)} = S_{ij}^{(k')} - \frac{S_{ik}^{(k')} S_{kj}^{(k')}}{S_{kk}^{(k')}}, \qquad i = 1, \ldots, k-1, k+1, \ldots, n, \quad j = 1, 2, \ldots, n.$$

In these calculations $(S_{ij})$ is the augmented matrix of dot products of
the estimation vectors and the dependent-variable vector. The superscript
$k'$ indicates the number of transformations on $(S_{ij})$ in which a column has
been reduced to a unit vector. The list of variables included in the
regression, and the sum of squares due to regression, are printed.
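The pivoting step can be sketched as follows. This is an illustrative Python version working on the full augmented matrix, not the B-220 program; the selection ratio used here (squared cross-product with the dependent variable divided by the diagonal element) is my reading of the garbled criterion, and the function names and example matrix are hypothetical.

```python
def pivot(S, k):
    """Apply the two pivot formulas to a copy of S, pivoting on variable k."""
    n = len(S)
    pk = S[k][k]
    new = [row[:] for row in S]
    new[k] = [S[k][j] / pk for j in range(n)]
    for i in range(n):
        if i != k:
            new[i] = [S[i][j] - S[i][k] * S[k][j] / pk for j in range(n)]
    return new

def step_up(S, steps):
    """Forward selection; the dependent variable is the last row and column."""
    dep = len(S) - 1
    total = S[dep][dep]
    included, history = [], []
    for _ in range(steps):
        candidates = [k for k in range(dep) if k not in included]
        # Assumed selection rule: largest reduction in residual sum of squares.
        k = max(candidates, key=lambda c: S[c][dep] ** 2 / S[c][c])
        S = pivot(S, k)
        included.append(k)
        # After pivoting, S[dep][dep] is the residual sum of squares.
        history.append((list(included), total - S[dep][dep]))
    return history

# Hypothetical example with two correlated estimation variables.
print(step_up([[2.0, 1.0, 3.0],
               [1.0, 2.0, 3.0],
               [3.0, 3.0, 10.0]], 2))
```

Each entry of the returned history pairs the list of included variables with the sum of squares due to regression, mirroring what the text says is printed at each step.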

In some cases the stepwise procedure gave optimal solutions, while
in others it did not. In an attempt to run the program with 18 variables
the time required to calculate the regression analyses for all
combinations of variables turned out to be prohibitive.


Operating Instructions for B-220 Program

1. Load the program, with the proper procedure (FCN) inserted to
calculate the independent polynomial variables and the dependent variables.

2. Load the following data card, using more than one card if
necessary, with 5 punched in the first column of each card.

Card Contents                                Card Format

a) Number of independent polynomial          Skip at least one column; punch
   variables                                 integer

b) Number of observations of                 Skip at least one column; punch
   independent polynomial variable           integer

c) Repeat (b) for each variable

d) Order in independent polynomial           Skip at least one column; punch
   variable                                  integer

e) Repeat (d) for each variable

f) Lower bound for diagonal element          Skip at least one column; punch
                                             floating point number

g) Lower bound for difference between        Skip at least one column; punch
   1.0 and off-diagonal correlation          floating point number

h) F-statistic for stopping                  Skip at least one column; punch
                                             floating point number; leave rest
                                             of card blank.

3. Repeat (2) for each analysis to be made.

4. Load 2 blank cards.

2. Comprehensive program for selection of variables with step-up
procedure incorporating elimination rules and stopping rules. This
procedure attempts to select the most significant estimation variables
for a least squares fitting. It has been programmed for the Burroughs
5000 computer in the ALGOL 60 compiler language.

There are three options for obtaining the $n \times n$ augmented $(S_{ij})$
matrix:

(1) Either the $(S_{ij})$ matrix or the correlation matrix may be read
in. (Only the diagonal and lower triangle are read in.)

(2) Each of the $M$ observations $(z_{\mu 1}, \ldots, z_{\mu n})$ may be read in. An
estimate $(m_1, \ldots, m_n)$ of the means is available. As the data are read
in, the sums

$$S_i = \sum_{\mu=1}^{M} (z_{\mu i} - m_i),$$

$$S_{ij} = \sum_{\mu=1}^{M} (z_{\mu i} - m_i)(z_{\mu j} - m_j), \qquad i = 1, 2, \ldots, n, \quad j = 1, 2, \ldots, i,$$

are calculated. The adjusted $(S_{ij})$ matrix, with elements

$$S_{ij} - \frac{S_i S_j}{M}, \qquad i = 1, 2, \ldots, n, \quad j = 1, 2, \ldots, i,$$

is then computed.
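The adjustment with provisional means can be checked with a short Python sketch. The function name and the sample observations are hypothetical; the point of the example is that the adjusted lower triangle does not depend on which estimate of the means is used.

```python
def adjusted_cross_products(observations, provisional_means):
    """Lower triangle of the adjusted sums of squares and products.

    Accumulates S_i and S_ij about the provisional means while the
    observations are read, then subtracts S_i * S_j / M.
    """
    M = len(observations)
    n = len(provisional_means)
    S1 = [0.0] * n
    S2 = [[0.0] * (i + 1) for i in range(n)]
    for z in observations:
        d = [z[i] - provisional_means[i] for i in range(n)]
        for i in range(n):
            S1[i] += d[i]
            for j in range(i + 1):
                S2[i][j] += d[i] * d[j]
    return [[S2[i][j] - S1[i] * S1[j] / M for j in range(i + 1)]
            for i in range(n)]

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 9.0)]
print(adjusted_cross_products(data, (0.0, 0.0)))
print(adjusted_cross_products(data, (2.0, 5.0)))
```

Both calls give the same lower triangle, the centered sums of squares and products; a good provisional mean only keeps the accumulated sums small, which helps numerically.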

(3) Each observation may be generated from balanced polynomials.
A set of fixed data points $(x_{\mu 1}, \ldots, x_{\mu \pi})$ is given. The estimation
variables are the terms of a balanced polynomial, so that

$$z_{\mu k} = x_{\mu 1}^{\lambda_1} x_{\mu 2}^{\lambda_2} \cdots x_{\mu \pi}^{\lambda_\pi},$$

where $\lambda_i = 0, 1, \ldots, L_i$, $i = 1, 2, \ldots, \pi$. Each of these combinations of
exponents (except all exponents zero) corresponds to one estimation
variable. The values $x_{\mu 1}, \ldots, x_{\mu \pi}$ may be read in, or they may be part
of a rectangular design, with each $\mu$ corresponding to some value of
the index $(t_1, \ldots, t_\pi)$, where $t_i = 1, \ldots, T_i$, $i = 1, 2, \ldots, \pi$. Values
$z_{\mu n}$ of the dependent variable may be read in or they may be computed
values of a specified function, corresponding to values $x_{\mu 1}, \ldots, x_{\mu \pi}$.

These vectors $x_{\mu 1}, \ldots, x_{\mu \pi}, z_{\mu n}$ are generated in a procedure which may
be varied with each run. As the observations $z_{\mu 1}, \ldots, z_{\mu n}$ are generated,
the sum of squares matrix $(S_{ij})$ is calculated as above.

Once the adjusted sum of squares matrix has been obtained it may
be used for more than one analysis. The diagonal and lower triangle
only are used in the analysis. Since the matrix is symmetric, the
necessary values may be stored in the upper triangle (with the diagonal
in a separate vector) for performing other analyses under different
conditions.

If the correlation matrix was read in, it is used in the regression
analysis; otherwise, there is the option of computing and using the
correlation matrix. The matrix to be used shall be denoted as $(S_{ij}^{(0)})$.
The program includes the option of printing this matrix.


In a hand computation the system of normal equations would be solved
for regression coefficients in a sequence of Gaussian eliminations, and
the inverse matrix would be built up on a unit matrix. The initial
tableau $(R_{ij}^{(0)})$ for such an elimination and matrix inversion procedure
would be defined by

$$R_{ij}^{(0)} = \begin{cases}
S_{ij}^{(0)}, & i = 1, \ldots, n; \quad j = 1, 2, \ldots, i \\
S_{ji}^{(0)}, & i = 1, 2, \ldots, n-1; \quad j = i+1, \ldots, n \\
1, & i = 1, \ldots, n; \quad j = n+i \\
0, & i = 1, \ldots, n; \quad j = n+1, \ldots, 2n, \quad j \ne n+i
\end{cases}$$

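The original S matrix is of the form

$$\begin{bmatrix}
S_{11}^{(0)} & & & & \\
S_{21}^{(0)} & S_{22}^{(0)} & & & \\
\vdots & \vdots & \ddots & & \\
S_{n-1,1}^{(0)} & S_{n-1,2}^{(0)} & \cdots & S_{n-1,n-1}^{(0)} & \\
S_{n1}^{(0)} & S_{n2}^{(0)} & \cdots & S_{n,n-1}^{(0)} & S_{nn}^{(0)}
\end{bmatrix}$$

while the original R matrix is of the form

$$\left[\begin{array}{ccccc|ccccc}
S_{11}^{(0)} & S_{21}^{(0)} & \cdots & S_{n-1,1}^{(0)} & S_{n1}^{(0)} & 1 & 0 & \cdots & 0 & 0 \\
S_{21}^{(0)} & S_{22}^{(0)} & \cdots & S_{n-1,2}^{(0)} & S_{n2}^{(0)} & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
S_{n-1,1}^{(0)} & S_{n-1,2}^{(0)} & \cdots & S_{n-1,n-1}^{(0)} & S_{n,n-1}^{(0)} & 0 & 0 & \cdots & 1 & 0 \\
S_{n1}^{(0)} & S_{n2}^{(0)} & \cdots & S_{n,n-1}^{(0)} & S_{nn}^{(0)} & 0 & 0 & \cdots & 0 & 1
\end{array}\right]$$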


Because of symmetry, operations need to be made only on the lower
triangle of the S matrix. Hence the entire R matrix need not be stored in

The stepwise procedure now begins. It is assumed that at the $k'$-th
step, $k$ estimation variables $Z_{p_1}, \ldots, Z_{p_k}$ are included in the regression,
while the $n-k-1$ variables $Z_{q_1}, \ldots, Z_{q_{n-k-1}}$ are excluded. The variables
$Z_{p_{\min}}$ and $Z_{q_{\max}}$ which minimize $(S_{n p_i}^{(k')})^2 / S_{p_i p_i}^{(k')}$ and maximize
$(S_{n q_j}^{(k')})^2 / S_{q_j q_j}^{(k')}$, respectively, are determined. The variable
$Z_{p_{\min}}$ shall be considered significant if

$$\frac{(S_{n p_{\min}}^{(k')})^2 / S_{p_{\min} p_{\min}}^{(k')}}{S_{nn}^{(k')} / (M-k-1)} \ge F_0$$

and the variable $Z_{q_{\max}}$ shall be considered significant if

$$\frac{(S_{n q_{\max}}^{(k')})^2 / S_{q_{\max} q_{\max}}^{(k')}}{\left[ S_{nn}^{(k')} - (S_{n q_{\max}}^{(k')})^2 / S_{q_{\max} q_{\max}}^{(k')} \right] / (M-k-2)} \ge F_1$$

where $F_1$ and $F_0$ are criteria based on the F-distribution. $F_1$ should not
be less than $F_0$; if it were, looping might occur.
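The two significance tests can be written as small functions, shown here as a hedged Python sketch. The argument names are hypothetical, and the formulas follow my reading of the garbled ratios above: a contribution to the regression sum of squares divided by a residual mean square with the stated degrees of freedom.

```python
def f_to_remove(s_np, s_pp, s_nn, M, k):
    """F statistic for an included variable (compared with F0).

    s_np, s_pp, s_nn are the current matrix elements for that variable
    and the dependent variable; M is the number of observations and k
    the number of variables currently in the regression.
    """
    contribution = s_np ** 2 / s_pp
    return contribution / (s_nn / (M - k - 1))

def f_to_enter(s_nq, s_qq, s_nn, M, k):
    """F statistic for an excluded candidate variable (compared with F1)."""
    gain = s_nq ** 2 / s_qq
    return gain / ((s_nn - gain) / (M - k - 2))

# Hypothetical numbers: 12 observations, no variables yet in the model.
print(f_to_enter(3.0, 1.0, 20.0, 12, 0))
print(f_to_remove(2.0, 1.0, 7.0, 12, 2))
```

Choosing the entry criterion at least as large as the removal criterion means a variable that has just passed the entry test cannot immediately fail the removal test on the same evidence, which is the looping the text warns about.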

The procedure now tests whether $Z_{p_{\min}}$ is to be dropped from the
regression. There are two options for dropping a variable:

(1) If $Z_{p_{\min}}$ is not significant, it is dropped. (This may be
bypassed by setting $F_0$ equal to zero.)

(2) The procedure alternately adds two variables and drops one.

If $Z_{p_{\min}}$ is not to be dropped, the procedure checks whether
to stop or not.

There are four criteria for stopping, the first two of which are now
checked.

(1) If $Z_{q_{\max}}$ is not significant, it is added and then the procedure
terminates. (This may be bypassed by setting $F_1$ to zero.)

(2) When a specified maximum number of terms have been included in
the regression, the procedure terminates. Unless otherwise
specified, this will be the number of estimation variables.

(3) If the square of the multiple correlation coefficient is
greater than a specified amount $R^2_{\max}$, the procedure
terminates. (This may be bypassed by setting $R^2_{\max}$ to 1.)

(4) When the procedure has gone through a specified number of
iterations, it terminates. If the procedure is following
the option of adding two variables and dropping one, this
will be three times the maximum number of terms; otherwise,
it will be twice the maximum number of terms.

If $Z_{p_{\min}}$ is not to be dropped, and if the procedure does not stop,
$Z_{q_{\max}}$ is now added to the regression.
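The four stopping criteria can be collected into one check, sketched below in Python. This is a simplification: the text checks the first two criteria at this point and the other two elsewhere in the sweep, whereas the hypothetical function here tests all four together; the argument names and returned strings are illustrative only.

```python
def stopping_reason(entering_significant, terms_in_model, max_terms,
                    r_squared, r_squared_max, iterations,
                    add_two_drop_one=False):
    """Return a description of the stopping rule that applies, or None."""
    # Rule (4): iteration limit depends on the dropping option in use.
    max_iterations = (3 if add_two_drop_one else 2) * max_terms
    if not entering_significant:
        return "entering variable not significant"
    if terms_in_model >= max_terms:
        return "maximum number of terms reached"
    if r_squared > r_squared_max:
        return "R squared exceeds limit"
    if iterations >= max_iterations:
        return "iteration limit reached"
    return None

print(stopping_reason(True, 5, 47, 0.90, 1.0, 10))
print(stopping_reason(True, 20, 47, 0.99, 1.0, 94))
```

Setting `r_squared_max` to 1 disables the third rule, since a squared multiple correlation cannot exceed 1, which matches the bypass described in the text.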


The jth column of the S matrix corresponds to the (j+n)-th of the
R matrix if the jth variable has been included in the regression and
to the jth column otherwise. (At all stages, either the jth column or
the (j+n)-th column of the R matrix will be a unit vector. The S matrix
will contain the column which is not. Of course the storage of the unit
vector is unnecessary.)

It will be assumed that the qth variable is to be added or dropped.
(The computational procedure is the same in both cases.) It will also
be assumed that $H_j^{(k')} = -1$ if the j-th variable is included in the
regression after $k'$ iterations and that $H_j^{(k')} = +1$ otherwise. (Note
that $H_n^{(k')} = +1$ throughout the analysis. $H_q^{(k')}$ depends on the status
of the qth variable before, rather than after, it is added or dropped.)

The following formulae determine the matrix $(S_{ij}^{(k'+1)})$:

$$S_{qq}^{(k'+1)} = \frac{1}{S_{qq}^{(k')}}$$

$$S_{qj}^{(k'+1)} = \frac{S_{qj}^{(k')}}{S_{qq}^{(k')}}, \qquad j < q$$

$$S_{iq}^{(k'+1)} = -\frac{S_{iq}^{(k')}}{S_{qq}^{(k')}}, \qquad i > q$$

$$S_{ij}^{(k'+1)} = S_{ij}^{(k')} - \frac{S_{qi}^{(k')} S_{qj}^{(k')} H_i^{(k')} H_q^{(k')}}{S_{qq}^{(k')}}, \qquad j \le i < q$$

$$S_{ij}^{(k'+1)} = S_{ij}^{(k')} - \frac{S_{iq}^{(k')} S_{qj}^{(k')}}{S_{qq}^{(k')}}, \qquad j < q < i$$

$$S_{ij}^{(k'+1)} = S_{ij}^{(k')} - \frac{S_{iq}^{(k')} S_{jq}^{(k')} H_j^{(k')} H_q^{(k')}}{S_{qq}^{(k')}}, \qquad q < j \le i$$

This is equivalent, on adding a variable, to

$$R_{qj}^{(k'+1)} = \frac{R_{qj}^{(k')}}{R_{qq}^{(k')}},$$

$$R_{ij}^{(k'+1)} = R_{ij}^{(k')} - \frac{R_{iq}^{(k')} R_{qj}^{(k')}}{R_{qq}^{(k')}}, \qquad i \ne q,$$

or, on dropping a variable, to

$$R_{qj}^{(k'+1)} = \frac{R_{qj}^{(k')}}{R_{q,q+n}^{(k')}},$$

$$R_{ij}^{(k'+1)} = R_{ij}^{(k')} - \frac{R_{i,q+n}^{(k')} R_{qj}^{(k')}}{R_{q,q+n}^{(k')}}, \qquad i \ne q,$$

where the (q+n)-th column of the R matrix takes the place of the qth in
the S matrix when a variable is being added.

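If the first k variables were included in the regression, then the
R matrix would be of the form

$$\left[\begin{array}{ccc|cccc|ccc}
1 & \cdots & 0 & -S_{k+1,1}^{(k)} & \cdots & -S_{n-1,1}^{(k)} & -S_{n1}^{(k)} & S_{11}^{(k)} & \cdots & S_{k1}^{(k)} \\
\vdots & & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & \cdots & 1 & -S_{k+1,k}^{(k)} & \cdots & -S_{n-1,k}^{(k)} & -S_{nk}^{(k)} & S_{k1}^{(k)} & \cdots & S_{kk}^{(k)} \\
0 & \cdots & 0 & S_{k+1,k+1}^{(k)} & \cdots & S_{n-1,k+1}^{(k)} & S_{n,k+1}^{(k)} & S_{k+1,1}^{(k)} & \cdots & S_{k+1,k}^{(k)} \\
\vdots & & \vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & S_{n-1,k+1}^{(k)} & \cdots & S_{n-1,n-1}^{(k)} & S_{n,n-1}^{(k)} & S_{n-1,1}^{(k)} & \cdots & S_{n-1,k}^{(k)} \\
0 & \cdots & 0 & S_{n,k+1}^{(k)} & \cdots & S_{n,n-1}^{(k)} & S_{nn}^{(k)} & S_{n1}^{(k)} & \cdots & S_{nk}^{(k)}
\end{array}\right]$$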

In effect the program inverts the S matrix in place, proceeding from
pivot element to pivot element without rearranging rows and columns. Also,
advantage is taken of the symmetry in carrying out calculations in the
lower triangle only.
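The in-place inversion can be illustrated with a short Python sketch of a Gauss-Jordan pivot on a full square matrix. This is not the lower-triangle storage scheme with sign indicators described above; it is a simpler full-matrix version, with a hypothetical function name, that shows two properties the text relies on: pivoting on every diagonal element in turn leaves the inverse in place, and pivoting twice on the same element restores the original matrix, so adding and dropping a variable use the same computation.

```python
def sweep(A, q):
    """Gauss-Jordan pivot on element (q, q), returning a new matrix."""
    n = len(A)
    d = A[q][q]
    B = [row[:] for row in A]
    for i in range(n):
        for j in range(n):
            if i != q and j != q:
                B[i][j] = A[i][j] - A[i][q] * A[q][j] / d
    for j in range(n):
        if j != q:
            B[q][j] = A[q][j] / d    # pivot row is divided by the pivot
            B[j][q] = -A[j][q] / d   # pivot column changes sign
    B[q][q] = 1.0 / d
    return B

A = [[2.0, 1.0],
     [1.0, 2.0]]
inverse = sweep(sweep(A, 0), 1)   # both variables brought in
restored = sweep(sweep(A, 0), 0)  # variable 0 brought in, then dropped
print(inverse)
print(restored)
```

After only the first pivot the off-diagonal elements linking an included and an excluded variable have opposite signs, which is why a scheme that stores only one triangle needs a record of which variables are in the regression.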

At this point, a list of included or active variables, the mean
squares due to regression and to error, the F-ratio, and the square of
the multiple correlation coefficient are printed. There are options for
printing the inverse matrix, the reduced sum of squares matrix, the
partial regression coefficients of the dependent variable on each of
the active variables, and the regression coefficients of the dependent
variable on the active variables.

On Least Squares:

ON THE NUMERICAL REPRESENTATION OF THE GENERAL
SOLUTION OF SYSTEMS OF ORDINARY DIFFERENTIAL EQUATIONS


By

Robert Silber


I.  INTRODUCTION  AND  SUMMARY 

We  consider  normal  systems  of  first  order,  ordinary* 
differential  equations*  i.e.*  we  consider  the  system 

y±  = f i ( t * y i * y2 * . • ■ * yn ) * 1 = 1,  2*...*  n*  (s) 


in  which  the  dot  indicates  differentiation  with  respect  to  t. 
Let  the  set  of  functions 


v 

Y-^(t*T*T)i*r)2J***jUj^)j  1 = 1 , 2*  •..*  n* 

be  the  gerieral  solution  to  (s)  in  terms  of  the  initial  * 

time  r and  the  initial  values  of  the  y^. 

Under  certain  conditions*  such  as  those  discussed*  the 
functions  Y^  will  be  analytic  at  a selected  point 
(t**T**r)i**r)2*j  • • • ,r)n*)  anci  will  therefore  be  expressible 
in  Taylor's  series  in  n+2  variables  neighboring  the  point 
( t** t** pi** 7]2*j  • • • *r)n*)  • The  information  needed  to  calculate 
the  coefficients  in  this  Taylor's  series  is  the  set  of  values 
of  the  partial  derivatives  of  the  Yp  at  the  point  (t**r**7)i** 
t)2  *.5  • • • s Un* ) • 

Within the numerical procedures discussed in Reference 3,
there is contained a method for obtaining the values of the
above partial derivatives, through any pre-specified order.
The method necessitates the use of a digital computer. In
writing Reference 3, this method was not given explicit mention,
because it was integrated into a more complex numerical process.
Since writing Reference 3, the author has come to realize that
perhaps the subject method is of sufficient interest to merit
an independent description. Thus, the purpose and content of
this paper is a description of the salient points of this
method; many of the troublesome details and minutiae are left
untreated, since they are all contained in Reference 3.
II.  THE FUNDAMENTAL IDENTITIES

The entire procedure is based on two fundamental identities
satisfied by the functions $Y_i$. Before writing the identities,
it will be convenient to introduce abbreviated notation as
follows:

$$Y = (Y_1, Y_2, \ldots, Y_n),$$

$$\eta = (\eta_1, \eta_2, \ldots, \eta_n).$$

Thus $Y(t,\tau,\eta) = (Y_1(t,\tau,\eta_1,\eta_2,\ldots,\eta_n), \ldots, Y_n(t,\tau,\eta_1,\eta_2,\ldots,\eta_n))$.

The first of the fundamental identities is a consequence
of the $Y_i$ being solutions to (s):

$$\frac{\partial Y_i}{\partial t}(t,\tau,\eta) = f_i(t, Y(t,\tau,\eta)), \qquad i = 1, 2, \ldots, n. \tag{1}$$


This is an identity in each of the $n+2$ arguments which
appear. In the event that each $f_i$ is analytic at the point
$(t, Y(t,\tau,\eta))$ and each $Y_i$ is analytic at the point
$(t,\tau,\eta)$, the two sides of (1) represent the same analytic
function, and new identities can be obtained from (1) by unlimited
differentiation. Thus, for example, using the chain rule,

$$\frac{\partial^2 Y_i}{\partial t^2}(t,\tau,\eta)
  = \frac{\partial f_i}{\partial t}(t, Y(t,\tau,\eta))
  + \sum_{j=1}^{n} \frac{\partial f_i}{\partial y_j}(t, Y(t,\tau,\eta))\,
    \frac{\partial Y_j}{\partial t}(t,\tau,\eta),$$

which, using (1), can be written
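
$$\frac{\partial^2 Y_i}{\partial t^2}(t,\tau,\eta)
  = \frac{\partial f_i}{\partial t}(t, Y(t,\tau,\eta))
  + \sum_{j=1}^{n} \frac{\partial f_i}{\partial y_j}(t, Y(t,\tau,\eta))\,
    f_j(t, Y(t,\tau,\eta)). \tag{2}$$

Similarly,

$$\frac{\partial^2 Y_i}{\partial t\,\partial \tau}(t,\tau,\eta)
  = \sum_{j=1}^{n} \frac{\partial f_i}{\partial y_j}(t, Y(t,\tau,\eta))\,
    \frac{\partial Y_j}{\partial \tau}(t,\tau,\eta) \tag{3}$$

and

$$\frac{\partial^2 Y_i}{\partial t\,\partial \eta_k}(t,\tau,\eta)
  = \sum_{j=1}^{n} \frac{\partial f_i}{\partial y_j}(t, Y(t,\tau,\eta))\,
    \frac{\partial Y_j}{\partial \eta_k}(t,\tau,\eta),
  \qquad k = 1, 2, \ldots, n. \tag{4}$$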

Clearly,  by  repeated  differentiations,  one  can  obtain 
identities  involving  partial  derivatives  of  higher  orders. 

The second of the two fundamental identities is a consequence
of the definition of the parameters $\tau, \eta_1, \eta_2, \ldots, \eta_n$
as being "initial values":

$$Y_i(\tau,\tau,\eta) = \eta_i, \qquad i = 1, 2, \ldots, n. \tag{5}$$

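As in (1), this is an identity in each of the $n+1$ arguments
appearing, and both sides can be differentiated indefinitely at
points of analyticity. Hence, differentiating with respect to $\tau$,

$$\frac{\partial Y_i}{\partial t}(\tau,\tau,\eta) + \frac{\partial Y_i}{\partial \tau}(\tau,\tau,\eta) = 0,$$

so that by (1) and (5),

$$\frac{\partial Y_i}{\partial \tau}(\tau,\tau,\eta) = -f_i(\tau,\eta). \tag{6}$$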
Also,

$$\frac{\partial Y_i}{\partial \eta_k}(\tau,\tau,\eta) = \delta_{ik}, \tag{7}$$

where $\delta_{ik}$ is the Kronecker delta.

Again, as in the case of Equation (1), further differentiations
can be performed, yielding identities involving
partial derivatives of progressively higher orders.

In the procedure to follow, Equations (1)-(7), and higher
order equations to be obtained through appropriate
differentiations, will be used.
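
As an illustration of the identities numbered (1), (5), (6) and (7),
consider the scalar equation $\dot{y} = a y$ with constant $a$. This
example is not taken from Reference 3; it is assumed here only because
its general solution is available in closed form,

$$Y(t,\tau,\eta) = \eta\, e^{a(t-\tau)}.$$

Direct differentiation then gives

$$\frac{\partial Y}{\partial t}(t,\tau,\eta) = a\,\eta\, e^{a(t-\tau)} = f(t, Y(t,\tau,\eta)),
  \qquad Y(\tau,\tau,\eta) = \eta,$$

$$\frac{\partial Y}{\partial \tau}(\tau,\tau,\eta) = -a\,\eta = -f(\tau,\eta),
  \qquad \frac{\partial Y}{\partial \eta}(\tau,\tau,\eta) = 1,$$

in agreement with the first identity, with the initial-value
identity, and with the two first-order consequences of the latter.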

III.  REFERENCE  POINTS  AND  REFERENCE  TRAJECTORIES 

As was pointed out in the introduction, the aim of the
method being described is the expansion of the functions
$Y_i$, $i = 1, 2, \ldots, n$, in Taylor's series about the pre-specified
point $(t^*, \tau^*, \eta_1^*, \eta_2^*, \ldots, \eta_n^*)$. It is a clear
necessity that the functions $Y_i$ be analytic at this point.
Analyticity is also sufficient for existence and convergence of the
Taylor's series neighboring the point of expansion, but for our
method we shall require further properties. To facilitate the
discussions concerned with these properties, we introduce some
definitions.

Definition: A real solution of (s) over a real interval
$[a,b]$ is a set $\{\varphi_1, \varphi_2, \ldots, \varphi_n\}$ of real-valued
functions, defined and differentiable on $[a,b]$, and satisfying

$$\dot{\varphi}_i(t) = f_i(t, \varphi_1(t), \varphi_2(t), \ldots, \varphi_n(t)),
  \qquad i = 1, 2, \ldots, n; \quad t \in [a,b].$$

In keeping with our earlier abbreviated notation, we let
$\varphi = (\varphi_1, \varphi_2, \ldots, \varphi_n)$, $f = (f_1, f_2, \ldots, f_n)$,
and write the above equation

$$\dot{\varphi}(t) = f(t, \varphi(t)), \qquad t \in [a,b],$$

where, of course, $\dot{\varphi} = (\dot{\varphi}_1, \dot{\varphi}_2, \ldots, \dot{\varphi}_n)$.
We shall refer to $\varphi$ itself as the solution over $[a,b]$.
Definition: Suppose $\varphi$ is a solution of (s) over $[a,b]$.
The set

$$\mathcal{O}(\varphi,a,b) = \{\varphi(t) : t \in [a,b]\},$$

which is a subset of $n$-dimensional space, is called the orbit
of $\varphi$ over $[a,b]$. The set

$$\mathcal{T}(\varphi,a,b) = \{(t, \varphi(t)) : t \in [a,b]\},$$

which is a subset of $(n+1)$-dimensional space, is called the
trajectory of $\varphi$ over $[a,b]$. A reference trajectory is a
trajectory $\mathcal{T}(\varphi,a,b)$ of a solution $\varphi$ over an
interval $[a,b]$ with the following property:

At each point $(t, \varphi(t)) \in \mathcal{T}(\varphi,a,b)$, each of the
functions $f_i$, $i = 1, 2, \ldots, n$, in (s), is analytic*.

A real reference trajectory is a reference trajectory
$\mathcal{T}(\varphi,a,b)$ for which $\varphi$ is real-valued in each
component. Analyticity, however, is still taken in the complex sense
(cf. the definition below).

From the theory of differential equations (References 1
and 2), it is known that if $(\tau, \eta_1, \eta_2, \ldots, \eta_n)$ is a
point at which each function $f_i$, $i = 1, 2, \ldots, n$, in (s), is
analytic, then there exists a unique complex function $\varphi$ of the
complex variable $z$ which is analytic in a complex neighborhood $N$
of $\tau$, which satisfies $\varphi(\tau) = \eta$, and which solves (s) at
each point of $N$.


Definition: A point $(t^*, \tau^*, \eta_1^*, \eta_2^*, \ldots, \eta_n^*)$
shall be called a reference point if the following conditions are met:

(i) Each $f_i$, $i = 1, 2, \ldots, n$, is analytic at
$(\tau^*, \eta_1^*, \eta_2^*, \ldots, \eta_n^*)$ and bounded on some complex
neighborhood of that point.

(ii) Let $\varphi$ be the unique solution of (s), analytic
at $\tau^*$, and satisfying $\varphi(\tau^*) = \eta^*$. Then $\varphi$ has an
analytic continuation, again denoted $\varphi$, along the real axis,
from $\tau^*$ to $t^*$.

(iii) $\mathcal{T}(\varphi,\tau^*,t^*)$ (or $\mathcal{T}(\varphi,t^*,\tau^*)$,
if $t^* < \tau^*$) is a reference trajectory.


* $f_i$ is analytic at $(t, \varphi_1(t), \varphi_2(t), \ldots, \varphi_n(t))$
if $f_i$ is expressible by a power series

$$\sum a_{\nu_0 \nu_1 \nu_2 \cdots \nu_n}\,
  (z_0 - t)^{\nu_0} (z_1 - \varphi_1(t))^{\nu_1} \cdots (z_n - \varphi_n(t))^{\nu_n},$$

which is convergent throughout an $(n+1)$ complex dimensional
neighborhood of $(t, \varphi(t))$, and represents $f_i$ there.

From this definition, it follows (for example, from
Theorem 8.2 in Chapter One of Reference 1) that if
$(t^*, \tau^*, \eta_1^*, \eta_2^*, \ldots, \eta_n^*)$ is a reference point,
then the general solution $Y(t,\tau,\eta_1,\eta_2,\ldots,\eta_n)$ mentioned
in the introduction is well-defined and analytic at each point
$(t,\tau,\eta)$ such that $(\tau,\eta) \in \mathcal{T}(\varphi,\tau^*,t^*)$ and
$t \in [\tau^*,t^*]$, and satisfies
$Y(t,\tau^*,\eta^*) = \varphi(t)$ for each $t \in [\tau^*,t^*]$. Thus, each
of the preceding differentiations of identities is justified.

In practice, the system (s) has the property that its
solutions are real if $t$ is real, and if the initial values
are real; consequently, the points in $\mathcal{T}(\varphi,\tau^*,t^*)$
will have real components instead of complex components.
Nevertheless, a complex, rather than real, notion of analyticity
must be retained, in order to justify the differentiations on which
the numerical procedure is to be based.


IV.  NUMERICAL  CALCULATIONS 

Let $(t^*, \tau^*, \eta_1^*, \eta_2^*, \ldots, \eta_n^*)$ be a given reference
point, and let $\varphi(t) = Y(t,\tau^*,\eta^*)$, $t \in [\tau^*,t^*]$, as in
the preceding definitions. Let $\mathcal{T}^* = \mathcal{T}(\varphi,\tau^*,t^*)$
be the reference trajectory defined by, and corresponding to, the
given reference point.

By numerical integration on a digital computer, points
of $\mathcal{T}^*$ can be calculated at selected values of $t$ in
$[\tau^*,t^*]$, so that we assume $\mathcal{T}^*$ to be numerically known.
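
The assumption that the reference trajectory is numerically known
can be illustrated by a short program. The following is a minimal
sketch and not the procedure of Reference 3: it uses a classical
fourth-order Runge-Kutta step, and the two-dimensional system
$\dot{y}_1 = -y_1^2$, $\dot{y}_2 = y_1 y_2$, the function names, the
reference point and the step count are all assumptions chosen for
illustration.

```python
def f(t, y):
    """Illustrative right-hand side (an assumption): y1' = -y1**2, y2' = y1*y2."""
    return [-y[0] ** 2, y[0] * y[1]]


def rk4_step(rhs, t, y, h):
    """Advance y from t to t + h with one classical Runge-Kutta step."""
    k1 = rhs(t, y)
    k2 = rhs(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = rhs(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = rhs(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def tabulate_trajectory(rhs, tau, eta, t_end, steps):
    """Return the points (t, phi(t)) on a uniform grid from tau to t_end."""
    h = (t_end - tau) / steps  # negative if t_end < tau, which is allowed
    t, y = tau, list(eta)
    points = [(t, y)]
    for k in range(1, steps + 1):
        y = rk4_step(rhs, t, y, h)
        t = tau + k * h
        points.append((t, y))
    return points


# Hypothetical reference point: tau* = 0, eta* = (1, 2), t* = 1.
trajectory = tabulate_trajectory(f, 0.0, [1.0, 2.0], 1.0, 100)
t_last, y_last = trajectory[-1]
```

For this particular system the closed form
$\varphi(t) = (1/(1+t),\ 2(1+t))$ is available, so the last tabulated
point can be compared with $(0.5, 4)$ as a check on the integration.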

We now define an $n \times n$ matrix function of time. Let $F$
be the matrix with elements $f_{ij}$ defined by

$$f_{ij}(t) = \frac{\partial f_i}{\partial y_j}(t, \varphi(t)); \qquad
  t \in [\tau^*,t^*], \quad i, j = 1, 2, \ldots, n.$$
In keeping with popular terminology, we shall call $F$ the
transition matrix of the system (s) along the trajectory $\mathcal{T}^*$.
Notice that since (s) is given, the functions

$$\frac{\partial f_i}{\partial y_j}(t, y_1, y_2, \ldots, y_n)$$

can be obtained by direct differentiation of the right hand
sides of (s). Further, since $\mathcal{T}^*$ is numerically known, we can
assume that by further calculations, $F$ is numerically known
for each $t \in [\tau^*,t^*]$.
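
The evaluation of $F$ along the reference trajectory can be sketched
as follows. This is an illustration under stated assumptions, not the
computation of Reference 3: the system $\dot{y}_1 = -y_1^2$,
$\dot{y}_2 = y_1 y_2$ and the reference point $\tau^* = 0$,
$\eta^* = (1, 2)$ are hypothetical, the closed-form `phi` stands in
for a numerically tabulated trajectory, and the central-difference
routine is included only as a cross-check on the hand-differentiated
entries.

```python
def f(t, y):
    """Illustrative right-hand side (an assumption): y1' = -y1**2, y2' = y1*y2."""
    return [-y[0] ** 2, y[0] * y[1]]


def phi(t):
    """Closed-form reference solution for tau* = 0, eta* = (1, 2) in this example."""
    return [1.0 / (1.0 + t), 2.0 * (1.0 + t)]


def F_analytic(t):
    """Entries df_i/dy_j, differentiated by hand and evaluated at (t, phi(t))."""
    y = phi(t)
    return [[-2.0 * y[0], 0.0],
            [y[1], y[0]]]


def F_finite_difference(t, h=1e-6):
    """Central-difference approximation to the same matrix, column by column."""
    y = phi(t)
    n = len(y)
    F = [[0.0] * n for _ in range(n)]
    for j in range(n):
        y_plus, y_minus = list(y), list(y)
        y_plus[j] += h
        y_minus[j] -= h
        f_plus, f_minus = f(t, y_plus), f(t, y_minus)
        for i in range(n):
            F[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return F


F_at_end = F_analytic(1.0)
F_check = F_finite_difference(1.0)
```

Differentiating by hand mirrors the text; the finite-difference
version is a convenient safeguard against algebraic slips when the
right hand sides of (s) are complicated.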


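Next, we define an $n \times (n+1)$ matrix function of time. Let
$X$ be the matrix with elements $x_{ij}$ defined by

$$x_{ij}(t) = \frac{\partial Y_i}{\partial \eta_j}(t,\tau^*,\eta^*); \qquad
  t \in [\tau^*,t^*], \quad i, j = 1, 2, \ldots, n,$$

$$x_{ij}(t) = \frac{\partial Y_i}{\partial \tau}(t,\tau^*,\eta^*); \qquad
  t \in [\tau^*,t^*], \quad i = 1, 2, \ldots, n, \quad j = n+1.$$
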

If, in Equations (3) and (4), one sets $(\tau,\eta) = (\tau^*,\eta^*)$,
then each side of the equations becomes a function of $t$ alone,
and partial derivatives with respect to $t$ become total.
Furthermore, the equations, taken over all indices, are equivalent to
the single matrix equation

$$\dot{X} = FX. \tag{8}$$

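Moreover, Equations (6) and (7) give the entries of $X(\tau^*)$:

$$X(\tau^*) =
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & -f_1(\tau^*,\eta^*) \\
0 & 1 & 0 & \cdots & 0 & -f_2(\tau^*,\eta^*) \\
0 & 0 & 1 & \cdots & 0 & \vdots \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & -f_n(\tau^*,\eta^*)
\end{pmatrix}, \tag{9}$$

in which the first $n$ columns form the $n \times n$ identity matrix
and the $(n+1)$st column has the entries $-f_i(\tau^*,\eta^*)$.
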
Direct numerical integration of (8) from $t = \tau^*$ to $t = t^*$,
using the initial value given by (9), will yield, with one
exception, all first partials needed for the Taylor's series.
The exception is

$$\frac{\partial Y_i}{\partial t}(t^*,\tau^*,\eta^*), \qquad i = 1, 2, \ldots, n,$$

but this can be calculated directly from the point $(t^*, \varphi(t^*))$
in $\mathcal{T}^*$ and the right side of Equation (1).
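
The first-order calculation just described can be sketched in a few
lines. What follows is a minimal illustration under stated
assumptions rather than the program of Reference 3: the system
$\dot{y}_1 = -y_1^2$, $\dot{y}_2 = y_1 y_2$, the reference point
$\tau^* = 0$, $\eta^* = (1, 2)$, $t^* = 1$, the Runge-Kutta integrator
and all function names are hypothetical. The trajectory and the
matrix $X$ are integrated together, a design choice made here so that
$F$ is available at every intermediate stage.

```python
def f(t, y):
    """Illustrative right-hand side (an assumption): y1' = -y1**2, y2' = y1*y2."""
    return [-y[0] ** 2, y[0] * y[1]]


def jac(t, y):
    """Matrix of df_i/dy_j for the illustrative system, evaluated at (t, y)."""
    return [[-2.0 * y[0], 0.0],
            [y[1], y[0]]]


def augmented_rhs(t, state, n):
    """Right-hand side for y' = f(t, y) together with X' = F X (X is n x (n+1))."""
    y = state[:n]
    X = [state[n + i * (n + 1): n + (i + 1) * (n + 1)] for i in range(n)]
    F = jac(t, y)
    dX = [[sum(F[i][k] * X[k][j] for k in range(n)) for j in range(n + 1)]
          for i in range(n)]
    return f(t, y) + [v for row in dX for v in row]


def rk4(rhs, t0, s0, t1, steps):
    """Integrate s' = rhs(t, s) from t0 to t1 with classical Runge-Kutta steps."""
    h = (t1 - t0) / steps
    t, s = t0, list(s0)
    for _ in range(steps):
        k1 = rhs(t, s)
        k2 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(s, k1)])
        k3 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(s, k2)])
        k4 = rhs(t + h, [a + h * b for a, b in zip(s, k3)])
        s = [a + h / 6 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(s, k1, k2, k3, k4)]
        t += h
    return s


def first_partials(tau, eta, t_end, steps=200):
    """Return phi(t_end), X(t_end) and the partials dY_i/dt at the reference point."""
    n = len(eta)
    f0 = f(tau, eta)
    # Initial value: identity in the first n columns, -f_i(tau, eta) in the last.
    X0 = [[1.0 if i == j else 0.0 for j in range(n)] + [-f0[i]] for i in range(n)]
    s0 = list(eta) + [v for row in X0 for v in row]
    s = rk4(lambda t, st: augmented_rhs(t, st, n), tau, s0, t_end, steps)
    y = s[:n]
    X = [s[n + i * (n + 1): n + (i + 1) * (n + 1)] for i in range(n)]
    dY_dt = f(t_end, y)  # the one exception, taken from the right side of (1)
    return y, X, dY_dt


y_end, X_end, dY_dt_end = first_partials(0.0, [1.0, 2.0], 1.0)
```

For this example the general solution is known in closed form,
$Y_1 = \eta_1 / (1 + \eta_1 (t - \tau))$ and
$Y_2 = \eta_2 (1 + \eta_1 (t - \tau))$, so the computed matrix can be
compared with the exact partial derivatives at the reference point.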

The method extends readily to higher order partials. The
analogue of (8) must be obtained by differentiations of (3)
and (4), and the analogue of (9) by differentiations of (6) and
(7). The new matrix equations will be of higher dimensions
since there are many more second partials than first partials.
The equations and determinations involved are treated in detail
in Reference 3, and so are not taken up here. It is felt
that a detailed description for first-order partials is
sufficient to convey the basic ideas of the method.